Understanding Master Data Management Using Excel and PowerPivot

Executive Summary

  • Master data management can be performed with Excel and PowerPivot.
  • The common problems that many applications have with master data maintenance.

Introduction to Master Data Management

I work with best-of-breed solutions across supply chain planning. Almost all of these applications have master data interfaces that take effort to learn, and each one is different from the others I work with. I was recently working with a production scheduling application that I happen to like, yet after I had created resources, I spent around 20 minutes figuring out how to adjust their names. Of the many applications I have used, SAP has the worst and most time-consuming master data management back-end. However, the fact is that the vast majority of these applications are just not that good, and this means I spend a lot of time entering data. In fact, as I am writing this article, I have spent the better part of a day creating a model, and I will be working on it for another two days, all while navigating a user interface I am not familiar with.

However, I am very familiar with Excel, so what value is added by learning yet another data interface? Secondly, if I want to extract this model or alter it, that will take even more time. This is an enormous waste of my time because the model is just the skeleton. The stuff I am testing is the application logic, so the more time I spend setting up the frame, the less time I can spend on the logic.

Modeling and Master Data with Excel

Of all the applications I have ever worked with, I vastly prefer the modeling and master data simplicity of Demand Works Smoothie. This is because Demand Works allows all of the system's master data to be set up in a spreadsheet. Smoothie works from a spreadsheet with multiple tabs, each with a specific predefined name. When the spreadsheet is imported into the model, the data is pulled from each tab and applied to the table that shares the tab's name. Not all of the tabs are mandatory: the application turns on a piece of functionality when its tab is included in the import file, and leaves that functionality off when the tab is omitted. The system is simple, sustainable, and brilliant. I can get Smoothie up and running faster than any other system I have ever used, and it is highly sustainable. For larger models, Smoothie runs off of a SQL database. However, even large models can be run from spreadsheets. A rough sketch of this tab-to-table import pattern is shown below.
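To make the pattern concrete, here is a minimal sketch in Python with pandas of how a tab-to-table import like this could work. The tab names and file name are purely illustrative assumptions, not Demand Works' actual schema or code.

```python
# Sketch of a Smoothie-style import: each predefined tab name maps to a table,
# and a feature is switched on only when its tab is present in the workbook.
# The tab names below ("Items", "Locations", ...) are hypothetical examples.
import pandas as pd

PREDEFINED_TABS = ["Items", "Locations", "Demand", "Supersessions"]  # assumed names

def import_master_data(path):
    workbook = pd.read_excel(path, sheet_name=None)  # dict of {tab name: DataFrame}
    model = {}
    enabled_features = set()
    for tab in PREDEFINED_TABS:
        if tab in workbook:
            model[tab] = workbook[tab]   # tab data becomes the table of the same name
            enabled_features.add(tab)    # presence of the tab activates that functionality
    return model, enabled_features

model, features = import_master_data("master_data.xlsx")  # illustrative file name
print("Loaded tables:", list(model))
print("Active functionality:", features)
```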

Another application that can read Excel files is MATLAB, the scientific computing application. It has a plug-in, called Builder EX (Excel), which allows the application to pull directly from a spreadsheet rather than from the internal arrays or the external database it typically uses. In this case, the spreadsheet is used as the data store rather than being loaded into a database (as with Demand Works).
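To illustrate the general "spreadsheet as data store" pattern (this is not Builder EX's actual API), here is a small Python sketch in which the workbook itself is queried on demand instead of being loaded into a database first. The file, sheet, and column names are assumptions.

```python
# Sketch of treating the workbook as the data store: each lookup reads the
# file on demand rather than loading it into a database up front.
# "model.xlsx", "Resources", and "ResourceID" are illustrative names.
import pandas as pd

def lookup_resource(path, resource_id):
    resources = pd.read_excel(path, sheet_name="Resources")  # read directly from the workbook
    match = resources[resources["ResourceID"] == resource_id]
    return match.iloc[0].to_dict() if not match.empty else None

print(lookup_resource("model.xlsx", "R-100"))
```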

The Portability of Models

All systems should have the ability to port models between systems. For instance, porting data between SAP systems is always a huge endeavor, and that is because the data back-end in SAP is unnecessarily complicated. This makes porting master data between systems highly inefficient, because what is being accomplished with all this effort is so elementary that it should be effortless. If it takes extraordinary effort and planning to do something basic, you are dealing with an inefficient design. The actual productivity of SAP infrastructure resources is appalling because of the overall weak design of the applications they have to work with. Best-of-breed vendors are certainly better, but every one I have worked with in the supply chain space lags behind Demand Works.

Excel and Record Limitations

Some people might say that this approach is not possible because of Excel’s record limitation. However, with PowerPivot, a Microsoft plug-in for Excel, the number of rows or records that Excel can handle is limited only by the computer’s memory. According to the book PowerPivot for the Data Analyst, a 64-bit Windows machine with 8 GB of RAM can hold at least 100 million rows. That is not a very large hardware specification.

There is a server version that costs money. However, the PC version of the software is free, as described in this article.

PowerPivot.WordPress.com has an interesting quote regarding PowerPivot, and the capacity it describes is incredible.

“Nearly a billion rows in Excel. If you have a 64-bit version of Office 2010, you are limited by available memory on the machine (presumably much larger than the old 3GB limit in 32-bit Windows) and by the 2GB File Size limit. However, since PowerPivot can achieve 15:1 data compression, you can open 30GB data sets in PowerPivot. For a theoretical file with 7 columns, I calculate about 995 million rows as a theoretical limit.” – PowerPivot.WordPress.com
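As a back-of-envelope check of those figures, the arithmetic works out if you assume a rough average uncompressed cell size; the original post does not state its exact assumptions, so the per-cell size below is my own guess.

```python
# Back-of-envelope check of the quoted capacity. The per-cell byte size is an
# assumption; the original post does not spell out its math.
file_size_gb = 2          # PowerPivot/Excel file size limit cited in the quote
compression_ratio = 15    # 15:1 compression cited in the quote
columns = 7
bytes_per_cell = 4.6      # assumed average uncompressed size per cell

raw_data_bytes = file_size_gb * compression_ratio * 1024**3   # ~30 GB of uncompressed data
rows = raw_data_bytes / (columns * bytes_per_cell)
print(f"Roughly {rows / 1e6:.0f} million rows")               # on the order of a billion,
                                                              # in line with the quoted ~995 million
```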

Secondly, PowerPivot is only one plug-in; there are several others. (Interestingly, PowerPivot is a bit of a pain to buy.) Others include Vizubi and QlikView. Yet, I tried both of these and found them to be more reporting-oriented than data-management-oriented.

https://vizubi.com/

https://www.qlikview.com/

With all the talk of different master data management software, it may be that the best master data management tool is right in front of us: Excel, enhanced with a plug-in and run on a powerful computer. This, of course, extends past data management and into business intelligence, but I will stop here because that gets into another large topic. However, there is a lot more smoke and effort going into business intelligence than usable intelligence coming out of it. Business intelligence is a trendy term for getting reports out of SQL databases. Calling something “BI” makes it sexier and increases its profit margins, of course.

Conclusion

I could see every application in the supply chain space being driven exactly the way Smoothie works. If this were to happen, it would revolutionize the usability and sustainability of systems. It would have benefits in so many areas that it’s hard to account for all of them. In supply chain planning, it would allow for much more simulation.

Simulation is greatly underutilized, partly because changing the master data in models is such a hassle and so time-consuming. This approach would also improve master data maintenance because master data would be checked and validated in a very friendly format. Formulas could be used to copy information across different product numbers (for example). The formula sheet would be the master sheet, and then the sheet could be exported as values when it is ready to be imported into the application (a rough sketch of this export follows below). There are so many areas where applications could be improved; this is just one. However, heavy groupthink in the enterprise software market is preventing a great deal of innovation.
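Here is a minimal sketch of the "formula sheet as master, export as values" idea, written in Python with openpyxl. It assumes the workbook has already been calculated and saved by Excel, since openpyxl does not evaluate formulas and only reads the values Excel cached; the file names are illustrative.

```python
# Sketch of exporting the formula-driven master sheet as a values-only copy
# before import. data_only=True reads the values Excel cached the last time
# the workbook was saved; formulas themselves are left behind.
from openpyxl import load_workbook, Workbook

source = load_workbook("master_formulas.xlsx", data_only=True)  # illustrative file name
export = Workbook()
export.remove(export.active)  # drop the default empty sheet

for sheet in source.worksheets:
    target = export.create_sheet(title=sheet.title)
    for row in sheet.iter_rows(values_only=True):
        target.append(row)    # copy cell values only

export.save("master_values.xlsx")  # this values-only file is what gets imported
```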

References

https://www.powerpivot.com/videos.aspx

https://www.mathworks.com/products/matlabxl/description2.html

https://www.amazon.com/PowerPivot-Data-Analyst-Microsoft-ebook/dp/B003V8B4I0/ref=sr_1_1?ie=UTF8&m=AG56TWVU5XWC2&s=digital-text&qid=1301868074&sr=1-1