A Prospect Dev Field Notebook
  • Home
  • About
  • Field Notes

On this page

  • What is PowerQuery?
    • Key Features
  • PowerQuery vs. VLookup
  • Use-cases
    • Abstract Example
    • Steps to Achieve This
  • Example from the Fundraising Job Title dataset project
    • Power Query M Code for Filtering Job Titles with Dynamic Keywords
      • Dynamic Keyword Parameters
      • Load and Filter Nonprofit Dataset
      • Reusable Filtering Function
      • Dataset Paths and Execution
      • Summary
  • References

Using PowerQuery (part 1)

Paid tool
Tutorials
Author

Greg Brooks

Published

October 3, 2025

What is PowerQuery?

Power Query is a data connection technology in Excel that allows users to connect, combine, and refine data from various sources.

Key Features

  • Data Importing: Pulls data from multiple sources (databases, web pages, files).
  • Data Transformation: Offers tools to clean and reshape data, such as filtering, merging, and aggregating.
  • User-Friendly Interface: Provides a visual interface for users to apply transformations without coding.
  • Integration: Seamlessly integrates with Excel, allowing transformed data to be loaded into worksheets for analysis.

In essence, Power Query simplifies data preparation, enabling users to focus on analysis and insights.

PowerQuery vs. VLookup

Feature Power Query VLOOKUP
Purpose Data transformation and preparation Look up and retrieve specific data values
Data Sources Connects to multiple sources (Excel, SQL, web, etc.) Limited to data within the worksheet
Complexity More complex with advanced capabilities Simpler and straightforward
Output Creates tables or queries in Excel Returns a single value based on a match
Flexibility Highly versatile, allows for data shaping Less flexible; limited to a single match type
Use Cases Ideal for ETL processes and bulk data manipulation Best for simple lookups within a single dataset
Maintenance Requires more setup but easier to update over time Simple to set up but can be error-prone if data changes

Use-cases

Your workflow may not require PowerQuery, it may be overkill. But given how much fiddling with Excel formulas can happen, and how that can eat up time, PowerQuery honestly seems like it should be more standard in our field - especially because of how reproducible any given process is - the steps being recorded automatically so they could be reused or made into a template.

Abstract Example

You have data in your CRM that you want to combine with another dataset you’ve found or created. For example, perhaps you’ve created a custom list of individuals who have won awards, but have not updated each of their individual profiles in the database yet. So you are looking to add this supplemental info to help segment a list in an interesting way.

Steps to Achieve This

  1. Export Data from CRM: Start by exporting the relevant data from your CRM into a CSV or Excel format.
  2. Import Data into Power Query: Open Excel, go to the Data tab, and select “Get Data” to import your CRM data and your custom dataset.
  3. Combine Datasets: Use Power Query tools to merge or append the datasets based on a common identifier.
  4. Transform Data: Clean, filter, or reshape the data as necessary.
  5. Load the Combined Data: Once satisfied with the combined data, load it back into Excel for analysis or reporting.

More complex data engineering can be done with PowerQuery, and business intelligence analytics with PowerBI

Example from the Fundraising Job Title dataset project

Power Query M Code for Filtering Job Titles with Dynamic Keywords

This document outlines a Power Query M script that dynamically loads keyword lists from parameters and filters job title datasets across multiple sectors. The goal is to identify roles related to Major Gifts while excluding irrelevant ones.

Dynamic Keyword Parameters

Instead of hardcoding the keywords, we define them as parameters:

let
    IncludeKeywords = Text.Split("Major Gift,Director Of Development,Development Officer,Major Gift Officer,Fundraiser,Philanthropy Officer,Philanthropy Manager", ","),
    ExcludeKeywords = Text.Split("Prospect Research,Event", ",")

Load and Filter Nonprofit Dataset

We load the nonprofit dataset and filter it using the dynamic keyword lists:

    Source = Csv.Document(File.Contents("..\jobtitlesdataset\Figure 4\dataset\titles\map\55198385_nonprofit.csv"), [Delimiter=",", Columns=7, Encoding=65001, QuoteStyle=QuoteStyle.None]),
    PromotedHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    ChangedType = Table.TransformColumnTypes(PromotedHeaders, {
        {"sector", type text}, 
        {"title_cleaned", type text}, 
        {"frequency_cleaned", Int64.Type}, 
        {"title_generalized", type text}, 
        {"frequency_generalized", Int64.Type}, 
        {"title_simplified", type text}, 
        {"frequency_simplified", Int64.Type}
    }),
    IncludedRows = Table.SelectRows(ChangedType, each List.AnyTrue(List.Transform(IncludeKeywords, (kw) => Text.Contains([title_cleaned], kw)))),
    FinalFilteredRows = Table.SelectRows(IncludedRows, each List.AllTrue(List.Transform(ExcludeKeywords, (kw) => not Text.Contains([title_cleaned], kw)))),
    TitleList = Table.SelectColumns(FinalFilteredRows, {"title_cleaned"}),

Reusable Filtering Function

We define a reusable function to apply the same filtering logic to other datasets:

    FilterDataset = (filePath as text) =>
        let
            Source = Csv.Document(File.Contents(filePath), [Delimiter=",", Columns=7, Encoding=65001, QuoteStyle=QuoteStyle.None]),
            Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
            ChangedType = Table.TransformColumnTypes(Promoted, {
                {"sector", type text}, 
                {"title_cleaned", type text}, 
                {"frequency_cleaned", Int64.Type}, 
                {"title_generalized", type text}, 
                {"frequency_generalized", Int64.Type}, 
                {"title_simplified", type text}, 
                {"frequency_simplified", Int64.Type}
            }),
            Matched = Table.Join(ChangedType, "title_cleaned", TitleList, "title_cleaned", JoinKind.Inner)
        in
            Matched,

Dataset Paths and Execution

We list all dataset paths and apply the filtering function:

    DatasetPaths = {
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198352_travel_and_tourism.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198376_transportation_and_logistics.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55197857_restaurants_bars_and_food_services.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198349_real_estate.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198379_telecommunications.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198388_oil_gas_energy_and_utilities.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198394_retail.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198373_media.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198403_manufacturing.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198358_insurance.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198406_information_technology.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198382_government.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198397_health_care.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198391_education.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198400_finance.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198190_consumer_services.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198364_construction_repair_and_maintenance.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198409_business_services.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55197803_agriculture_and_forestry.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198355_arts_entertainment_and_recreation.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198370_biotech_and_pharmaceuticals.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198361_aerospace_and_defense.csv",
        "..\jobtitlesdataset\Figure 4\dataset\titles\map\55198367_accounting_and_legal.csv"
    },
    FilteredTables = List.Transform(DatasetPaths, each FilterDataset(_)),
    CombinedResults = Table.Combine(FilteredTables)
in
    CombinedResults

Summary

This script dynamically loads keyword lists and applies consistent filtering across multiple datasets. It improves maintainability and scalability by avoiding hardcoded values.


References

  • Disclaimer - This content was summarized or partially generated by AI.

Back to top