Data Wrangling

I really like the term “Data Wrangler”. It’s the best title I can give to the people in charge of working with large amounts of data that is often messy and poorly formatted. The proper term is “Data Analyst”, and more and more it seems to imply dealing with large amounts of data. Big data. I disagree. Most college-educated (or higher) people should be comfortable dealing with well-structured data files. That’s not a Data Wrangler. High school students should have enough skills to view most open data from any government site. There’s nothing special about an Excel file or a CSV file. When you’re cleaning 1oo, maybe two1oo, M3ssy, r0tt1n f1lez 000z1ng with typos, that’s data wrangling.
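
To give a flavour of what that cleaning looks like on a single file, here is a minimal sketch in Python using only the standard library. The function name clean_csv_text and the sample text are made up for illustration; real files would need their own rules, and this only tidies header names, trims stray whitespace and drops blank lines.

```python
import csv
import io


def clean_csv_text(raw_text):
    """Return rows as dicts with tidy header names and trimmed cell values."""
    reader = csv.reader(io.StringIO(raw_text))
    # Drop rows that are empty or contain only whitespace.
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    if not rows:
        return []
    # Normalise headers: trim, lowercase, spaces to underscores.
    header = [name.strip().lower().replace(" ", "_") for name in rows[0]]
    cleaned = []
    for row in rows[1:]:
        values = [cell.strip() for cell in row]
        cleaned.append(dict(zip(header, values)))
    return cleaned


# Made-up sample: padded headers, a blank line, padded values.
sample = " Site Name , PH \n\n  Site A , 7.9 \n"
records = clean_csv_text(sample)
print(records)
```

Doing this once is easy. Doing it across a couple hundred files that each break in a different way is where the wrangling comes in.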

Listening to CBC Radio’s The World at Six, I heard them say that putting data in spreadsheets and text files makes it too difficult for the public to view the data (of course I can’t find the podcast’s link; it was probably erased). They were referring to government requirements to monitor oil sands activity in Alberta.

How else would you present scientific data to the public? Word art? Seriously, I think a tabular format is the ONLY way to present the data. Any other way would diminish the data’s value.

The Pot Calling the Kettle Black.

Digging, I found the news item, “Alberta oilsands monitoring needs to be clearer to public: review”, but don’t look there for any answers as to WHO. The report quotes someone from the USA and refers to the “independent review” as if we all know what that means. WHO???

Google saves the day. I think (still not sure) that they were referring to the Alberta Environmental Monitoring, Evaluation and Reporting Agency and its news release. The panel, commissioned by Environment and Climate Change Canada (ECCC) and Alberta Environment, said that the data should be made more accessible for public viewing. So I checked out the Canada-Alberta Oil Sands Environmental Monitoring Information Portal.

Well, the data was all there, including an interactive map and web-visible file directories of all the data. I think there is more they can do with the interactive map, but the data’s there. If anyone wants to do better, have at it. I think I may take up that offer with ArcGIS Online.

However, when I checked out the “independent” panel of experts to find out what they’ve done, many of their scientific articles were in formats where the raw data was inaccessible, such as PDF, and most were behind a paywall! How ironic.