Shift cells up/down in same column in Power Query

In my previous post I wrote about one interesting technique used by my colleague Zoltán Kaszaki-Krsjak.

Categoty_tmp = Table.AddColumn(Buffer, "Category_tmp", each Buffer[Category]{[ID2]}?),
// It helps to shift values of column [Category] one row upwards.

It is a trick that you most probably will not use in any of your solutions. However, it is worth knowing how it works and how to do the same without adding new columns.

“each Buffer[Category]{[ID2]}?” is just a function, so we can combine it with methods described in one of my posts (Transform Column Using Custom Function).

Here is the code.

let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Duplicated Column" = Table.DuplicateColumn(Source, "Category", "Method1"),
#"Duplicated Column1" = Table.DuplicateColumn(#"Duplicated Column", "Category", "Method2"),
#"Duplicated Column2" = Table.DuplicateColumn(#"Duplicated Column1", "Category", "Method3"),
Buffer = Table.Buffer( #"Duplicated Column2" ),

Method1 = Table.FromRecords( Table.TransformRows( Buffer, each [Index = [Index],
    Category = [Category],
    Method1 = Buffer[Method1]{[Index]}?,
    Method2 = [Method2],
    Method3 = [Method3] ] ) ),

Method2 = Table.FromRecords( Table.TransformRows( Method1, (row) =>
    Record.TransformFields( row,
    {"Method2", each Buffer[Method2]{ row[Index] }? } ) ) ),
    // method offered by Miguel Escobar in comment to previous post

Method3 = Table.ReplaceValue(Method2, each [Method3], each Buffer[Method3]{[Index]}?, Replacer.ReplaceValue, {"Method3"})
in
Method3

File with methods is here.
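
To try the shift without the sample workbook, here is a minimal sketch on a small inline table. The table, its values and the column names ShiftedUp and ShiftedDown are made up for illustration; it also assumes that Index starts at 1, as the queries above seem to.

```powerquery
let
    // Hypothetical sample data standing in for Table1; Index starts at 1.
    Source = #table(
        type table [Index = Int64.Type, Category = text],
        {{1, "A"}, {2, "B"}, {3, "C"}}
    ),
    Buffer = Table.Buffer(Source),
    // Buffer[Category] is a zero-based list, so position [Index] (1-based)
    // is the value of the next row; "?" returns null past the end of the list.
    ShiftedUp = Table.AddColumn(
        Buffer, "ShiftedUp",
        each Buffer[Category]{[Index]}?,
        type nullable text
    ),
    // Position [Index] - 2 is the previous row. A negative position is not
    // a valid list position, so the first row is guarded explicitly.
    ShiftedDown = Table.AddColumn(
        ShiftedUp, "ShiftedDown",
        each if [Index] >= 2 then Buffer[Category]{[Index] - 2} else null,
        type nullable text
    )
in
    ShiftedDown
```

With this sample, ShiftedUp holds B, C, null and ShiftedDown holds null, A, B. New columns are used here only to keep the sketch readable; the three methods above do the same lookup in place.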

But which method is faster?

Bar-Mekko chart in Excel with Power Query

It seems that Excel charts are an area that blogs have not yet considered as a target for Power Query (Get & Transform in Excel 2016).

Nevertheless, PQ can replace some VBA solutions and make your workbooks macro-free.

Back in 2015 my colleague Zoltán Kaszaki-Krsjak shared with me a very good example of how Power Query can help generate specific tables for specific charts, which are widely used in our organization.

The idea of writing a blog post about this technique gathered dust in my OneNote, and would probably have waited longer if Jon Peltier hadn’t drawn my attention to the topic again with his recent post.

The sample workbook contains a solution for a Bar-Mekko chart (or “variable width column chart”).


Such a chart makes it easy to see the share of categories, growth or absolute value. It can be used to compare market segments or the productivity of departments / subsidiaries. The red line in this case shows average growth, another small but important detail.

Interested in how to build it?

Query folding and dynamic parameters in SQL Queries in Power Query

Before you start reading this article, I recommend going through a very good post on Ken Puls’s blog: Pass Parameters to SQL Queries.

If you are fine with using native database queries, then the scenario described below is most probably not so interesting for you.

Here I’ll talk about a workaround that lets you use query folding and pass parameters in Power Query without building a native SQL query.

Level: intermediate

Task definition

Generate N workbooks with data models populated with a limited data scope.

Each workbook will then be shared with a certain group of people who are allowed to see only the relevant set of data (e.g. department OPEX of a particular subsidiary).

This is dictated by the fact that if you share an Excel workbook with someone, the user immediately has access to all the info inside the data model. So I cannot create one common data model and hide a slicer somewhere with the department pre-selected. I must prepare separate workbooks.

Conditions

Data: a table with millions of records in a SQL database, the main data source for the report (query folding is required so that filtering happens on the server instead of downloading millions of rows)

Tools: Excel + Power Query + PowerPivot

Support data: table in Excel, defining datasets – data scopes, let’s call it Control Table

Power Query: Xlsb vs Xlsx

Recently I helped my colleague with a report. The task was to consolidate data from multiple Excel files, each containing 100k-500k rows. Hence, each Excel file is relatively big.

I proposed storing those files as xlsb. There is one file per month and the report needs a rolling 24 months (R24M), so 50 MB (xlsb) is better than 150 MB (xlsx).

In the end I did a deep analysis of which format is better as an Excel-file database for Power Query: XLSB or XLSX.

Summary

For PQ solutions use XLSX instead of XLSB

PQ reads from XLSX twice as fast as from XLSB

The engine handling XLSB returns an error from time to time: cannot read file

“UseHeaders” parameter of Excel.Workbook function works differently for xlsx and xlsb

If a sheet and a table have the same name in an Excel workbook, Navigator generates a unique object name.

If you are interested in how I found all this out, continue reading.

Several ways of data extraction from SAP Business ByDesign

There are several reasons to extract data from a corporate ERP system.

The first and most important one is reporting: nicely visualized data, usually pre-calculated using specific logic. We all want to see beautiful reports.

The second, I think, is data transfer to BI system(s), a DWH etc. For some companies this may be the first priority instead of reporting.

The third is rare, but it happens in my practice: data migration due to system “end-of-life”. Yes, this happens. What can I say? At the moment I’m involved in such a project, the second in my practice.

I would like to share my experience. I hope it will simplify the life of many other reporting specialists working with SAP Business ByDesign.
First of all, the basic ways of getting data:

Xlsb vs Xlsx PowerPivot model

I bet a number of posts comparing workbook size across formats can be found on the Internet. However, nothing beats your own data :).

Assume that we have an Excel file with quite a big amount of data:

~250k rows


As we already know, the XLSB format makes the file much smaller than XLSX


But what if we load the data into the Data Model of an empty workbook: will it be smaller than xlsb?

To check this I created a new Excel workbook, went to the PowerPivot tab and opened the Data Model window


I need to import from Excel

It is stored under “From Other Sources”; scroll down to the end


Browse for my source file (I took xlsx as a source)


At the next step I can use “Preview and Filter” to exclude unnecessary data.

You can see checkboxes and the usual dropdown filter buttons on the columns; as in Excel, I can select what I need in the filter menu


For this “size challenge” I should take everything. So I simply press “Finish” and wait while PowerPivot imports the data.

After a few seconds we can see the result. All rows were imported, with no errors.


Result in PowerPivot window


Save this file as xlsx and xlsb


The champion is still XLSB with a simple table. The PP version is 11% larger.

And we can see a minimal difference between the PP versions in different formats: xlsx and xlsb have almost the same size.

And what is interesting: xlsb is slightly bigger than xlsx in this case.

Let’s make one more test. Delete the text column from the initial file. It contains comments for transactions.


This column contains a huge number of unique values; in addition, it is text, the heaviest data type.

I repeated the same actions with the file without the text column, and the result is the opposite


The PP version is 11% smaller than the initial “xlsb without text column” and 4.5 times smaller than the initial “Without text column.xlsx”.

Thoughts that come to my mind after all this

The choice of where to store data, in an Excel table or in PowerPivot, depends on the data. There are pros and cons. In the end, PowerPivot doesn’t lose: data models usually contain more numbers and less text.

Working with 250k rows in Excel is so slow that it is close to impossible; several SUMIFS will raise a “lack of memory” error. Instead of SUMIFS, PivotTables can be used and then the GETPIVOTDATA function… All of this “old style” modelling, which helped us a lot before PowerPivot came, should stay in the past.

What if we want to analyze R12M and take 12 files? We will have >1M rows, which an Excel sheet does not support, BUT PowerPivot supports it and still calculates quickly.

Happy modelling in 2016!

Generation of custom Calendars in Power Query

In my previous post I described how to build relative date functions in Power Query.

A related topic is the generation of a Calendar table for the Excel Data Model (aka PowerPivot), where it is better to have only the necessary period of time, and in some cases only a certain list of dates, e.g. the ends of the last X months if we analyse monthly results.

Assume that we have a model with sales of the previous month. Quite often in such a model we don’t need dates of previous or future years, and when we refresh the model at the beginning of a new month, our Calendar should be refreshed as well.

I took the idea of calendar generation from a post on powerpivotpro.com (http://www.powerpivotpro.com/2015/02/create-a-custom-calendar-in-power-query/), thanks to Matt Allington.

The functions List.Dates and List.Generate can help us get an initial list of dates and then extend it into a useful calendar with long month names, short month names etc.
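
As a rough illustration, a minimal sketch of such a calendar could look like the query below. It is not the code from the full post: the step names, the column names, the "en-US" culture and the value X = 3 are assumptions made for the example.

```powerquery
let
    // Assumption: the calendar covers the previous calendar month relative to today.
    Today = Date.From(DateTime.LocalNow()),
    StartDate = Date.StartOfMonth(Date.AddMonths(Today, -1)),
    EndDate = Date.EndOfMonth(StartDate),
    DayCount = Duration.Days(EndDate - StartDate) + 1,
    // List.Dates: start date, number of items, step of one day.
    Dates = List.Dates(StartDate, DayCount, #duration(1, 0, 0, 0)),
    Calendar = #table(type table [Date = date], List.Transform(Dates, each {_})),
    AddYear = Table.AddColumn(Calendar, "Year", each Date.Year([Date]), Int64.Type),
    AddMonthNo = Table.AddColumn(AddYear, "MonthNo", each Date.Month([Date]), Int64.Type),
    // The culture is fixed so that month names do not depend on the local machine.
    AddMonthLong = Table.AddColumn(AddMonthNo, "MonthLong",
        each Date.ToText([Date], "MMMM", "en-US"), type text),
    AddMonthShort = Table.AddColumn(AddMonthLong, "MonthShort",
        each Date.ToText([Date], "MMM", "en-US"), type text),
    // List.Generate: ends of the last X months (X = 3 is a hypothetical value).
    X = 3,
    MonthEnds = List.Generate(
        () => 1,
        each _ <= X,
        each _ + 1,
        each Date.EndOfMonth(Date.AddMonths(Today, -_))
    )
in
    [Calendar = AddMonthShort, MonthEnds = MonthEnds]
```

The query returns a record only to show both results side by side; in a real model each part would more likely be a separate query loaded to the Data Model.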

Excel online. How it works with dates in text format

Using DATEVALUE isn’t a good idea if you publish a workbook on SharePoint. You have to be sure that the locale on your computer is the same as on the SharePoint site where you publish the workbook.

In this example the date is presented as text. SAP ByDesign always shows dates as text in reports. The text depends on the user parameters set in ByD.

It can be DD.MM.YYYY or MM/DD/YYYY, or something else. Unfortunately, from the report side we cannot find out which date format is set. In our region we agreed that the default date format users should use is DD.MM.YYYY.

Back to DATEVALUE. It is a function that determines a date from a text string. A reasonable question: what is ‘03.04.2015’ (or ‘04.03.2015’)? If you don’t know which date format was used, DD.MM.YYYY or MM.DD.YYYY, you cannot answer. But Excel needs an answer, so it uses the locale set in the “Region and language” parameters.

For example, on your computer date format is DD.MM.YYYY and “28.02.2015” is recognized by DATEVALUE as 28-Feb-2015 – cool.

However, when you upload the workbook to a SharePoint site where the locale is set to English (U.S.), DATEVALUE in the workbook tries to read the date as MM/DD/YYYY, but there is no slash (/) in “28.02.2015”: the first failure of the formula. Even if it were “28/02/2015”, is there a month “28” in the calendar? That is the second failure.

Instead of using DATEVALUE I would recommend cutting the string into parts manually using the LEFT, MID and RIGHT functions.

If you are sure that the first two characters are the day, the month is the two characters in the middle of the string, and the year is the last 4 characters, then

DATE(RIGHT(A1, 4), MID(A1, 4, 2), LEFT(A1, 2)) will give you the right date. This option does not depend on the locale of the computer or the SharePoint site.
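
If the same text dates arrive in Power Query instead of a worksheet cell, the same idea can be sketched in M. This is only an illustration under the same assumption of a fixed DD.MM.YYYY layout; the function name ParseByDText is made up for the example.

```powerquery
let
    // Hypothetical helper: cuts a "DD.MM.YYYY" string by position,
    // so the result does not depend on any locale setting.
    ParseByDText = (t as text) as date =>
        #date(
            Number.FromText(Text.End(t, 4)),       // year: last 4 characters
            Number.FromText(Text.Middle(t, 3, 2)), // month: 2 characters from zero-based position 3
            Number.FromText(Text.Start(t, 2))      // day: first 2 characters
        ),
    Result = ParseByDText("28.02.2015")
in
    Result
```

Note that Text.Middle counts positions from zero, while MID in Excel counts from one, which is why the worksheet formula uses 4 and the M sketch uses 3.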

Dynamic Table Headers in Power Query (SAP ByDesign, Odata)

Another good example of how awesome Power Query is. And again I refer to a well-known blogger: Ken Puls’s post describes a situation where we have pivoted data and need to unpivot it and then rename the columns. Fortunately, in that example not many columns are used, and unpivoting plus a join with a “rename table” is a solution.
However, in the common case, when we have a table with 20 columns and 100 000 rows (or 2 million), unpivoting is a very bad decision.
I faced this challenge when I started to work with the Odata protocol in SAP Business ByDesign.

Using Odata, we have to provide a string with the list of field IDs in $select. For example, the query for Account Contact Data:

https://myXXXXXX.sapbydesign.com/sap/byd/odata/cc_home_analytics.svc/RPBPCSCONTB_Q0001QueryResults?$top=100000&$select=CBO_ROLE_CODE,CBP_STATUS_CODE,CBP_UUID,CCONT_FUNCTIONAL_AREA_CODE,CCONT_FUNCTION_TYPE_CODE,CCONT_MAIN_INDICATOR,CCONT_UUID,CCONT_WPA_BUILDING,CCONT_WPA_DEPARTMENT_NAME,CCONT_WPA_EMAIL_CONTENT,CCONT_WPA_FAX_NUMBER,CCONT_WPA_FLOOR,CCONT_WPA_FUNCTIONAL_TITLE,CCONT_WPA_MOBILE_NUMBER,CCONT_WPA_PHONE_NUMBER,CCONT_WPA_POSTAL_ADR_DESCR,CCONT_WPA_PREF_COMMU_MEDIUM,CCONT_WPA_ROOM_ID,CCURR_EMPL_RESP_PARTY_ROLE_CODE,CCURR_EMPL_RESP_UUID_CONTENT,CDEFADDR_FORM_POSTAL_ADDR,FK0COUNT,KK0COUNT,TBO_ROLE_CODE,TBP_STATUS_CODE,TBP_UUID,TCONT_FUNCTIONAL_AREA_CODE,TCONT_FUNCTION_TYPE_CODE,TCONT_UUID,TCONT_WPA_PREF_COMMU_MEDIUM,TCURR_EMPL_RESP_PARTY_ROLE_CODE,TCURR_EMPL_RESP_UUID_CONTENT&$format=json
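
One way to avoid typing such a string by hand, and to rename the resulting columns without unpivoting, is sketched below. It is not necessarily the approach taken in the full post. The three field IDs are a subset of the URL above; the readable headers, the sample row and the step names are invented for illustration.

```powerquery
let
    // Hypothetical rename table: field ID -> readable header (headers are made up).
    RenameTable = #table(
        {"FieldID", "Header"},
        {
            {"CBO_ROLE_CODE", "Role Code"},
            {"CBP_STATUS_CODE", "Status Code"},
            {"CBP_UUID", "Business Partner UUID"}
        }
    ),
    Fields = RenameTable[FieldID],
    BaseUrl = "https://myXXXXXX.sapbydesign.com/sap/byd/odata/cc_home_analytics.svc/RPBPCSCONTB_Q0001QueryResults",
    // Build the $select list from the same table that drives the renaming.
    Url = BaseUrl & "?$top=100000&$select=" & Text.Combine(Fields, ",") & "&$format=json",
    // Stand-in for the data returned by the service (one made-up row).
    Data = #table(Fields, {{"1", "2", "abc"}}),
    // Table.ToRows gives {old name, new name} pairs; columns missing from Data are ignored.
    Renamed = Table.RenameColumns(Data, Table.ToRows(RenameTable), MissingField.Ignore)
in
    [Url = Url, Renamed = Renamed]
```

Renaming touches only the column headers, so its cost does not grow with the number of rows the way unpivoting does.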
