Sunday, May 27, 2018

On Patterns Of Life: From MacroData to PicoData

Insight (or GeoInsight in our case) is lost in the deluge of data that we are acquiring today from everyday sensors that are machine or humanly generated. This project is a set of heuristic Spark based implementations to reveal signals from the movement of ships in and out of the Port of Miami.

The idea is to extract small clean data (PicoData) from the overlap of a massive amount of data (MacroData). The aggregation of "clean" PicoData derived from MacroData trust into the forefront patterns of life.

For example, given the following display of AIS broadcasts:

img-alternative-text

We can mutate the data to reveal the "clean" influx of ships into the harbor at high tide:

img-alternative-text

Like usual, you can download all the source code from here.

Monday, January 1, 2018

On ML and Elastic Principle Graphs

Happy 2018 all. It has been a while since my last post. Thank you for your patience dear reader. Like usual, the perpetual resolutions for every year in addition to blogging more are to eat well, often exercise and climb Ventoux.

Onward.

I genuinely believe that 2018 will be the year of the ubiquity of Geo-AI. It will be the year when Machine Learning and Spatial Awareness will blossom inside and mostly outside the GIS community.

We at Esri have had Machine Learning based tools in our "shed" for a long time. Every time an ArcGIS user performs a graphically weighted regression, trains a random trees classifier or detects an emerging hot spot, that user is using a form of Machine Learning without knowing it!

So one of my "missions" for 2018, it to make this knowledge more explicit to our users and non-traditional GIS users. Also, start to implement new forms of Machine Learning.

Machine Learning (ML), a branch of Artificial Intelligence (AI), is a disruptive force that is changing how today's industries are gaining new insight from their data. ML uses math, statistics and probability to find hidden patterns and make predictions from the data without being explicitly programmed. It is this last statement that is disruptive, "No explicit programming"! An ML algorithm iterates "intelligently" over the data, and the patterns emerge. Being iterative, the more data an ML algorithm is exposed to, the more refined the output becomes. Thus the coupling of BigData and ML is a perfect marriage fueled by cheap storage, ever more powerful computational power (think GPU) and faster networking.

This reemergence of this "No Explicit Programming" paradigm such as Deep Learning, Reinforcement Learning, and Self Organization is skyrocketing the likes of Google's AlphaGo-Zero, Facebook, and Uber.

So, I am starting this launch with something I have been fascinated by for quite some time, and that is "Elastic Principle Graphs."

It is a "deep" extension of PCA that I came across it during my research of mapping noisy 2D data to a curve and was fascinated by its self-organization.

img-alternative-text

After reading, (and rereading for the nth time) this paper, this GitHub repo is a minimalist implementation in Scala.

Happy New Year All.