Dr. Mike Flaxman is currently the VP of Product at HEAVY.AI, having previously served as Product Manager and led the Spatial Data Science practice in Professional Services. He has spent the last 20 years working in spatial environmental planning. Prior to HEAVY.AI, he founded Geodesign Technologies, Inc and cofounded GeoAdaptive LLC, two startups applying spatial analysis technologies to planning. Before startup life, he was a professor of planning at MIT and Industry Manager at ESRI.
HEAVY.AI is a hardware-accelerated platform for real-time, high-impact data analytics. It leverages both GPU and CPU processing to query massive datasets quickly, with support for SQL and geospatial data. The platform includes visual analytics tools for interactive dashboards, cross-filtering, and scalable data visualizations, enabling efficient big data analysis across various industries.
Can you tell us about your professional background and what led you to join HEAVY.AI?
Before joining HEAVY.AI, I spent years in academia, ultimately teaching spatial analytics at MIT. I also ran a small consulting firm, with a variety of public sector clients. I’ve been involved in GIS projects across 17 countries. My work has taken me from advising organizations like the Inter-American Development Bank to managing GIS technology for architecture, engineering and construction at ESRI, the world’s largest GIS developer.
I remember vividly my first encounter with what’s now HEAVY.AI, which was when, as a consultant, I was responsible for scenario planning for the Florida Beaches Habitat Conservation Program. My colleagues and I were struggling to model sea turtle habitat using 30m Landsat data, and a friend pointed me to some brand new and very relevant data: 5cm LiDAR. It was exactly what we needed scientifically, but something like 3600 times larger than what we’d planned to use. Needless to say, no one was going to increase my budget by even a fraction of that amount. So that day I put down the tools I’d been using and teaching for several decades and went looking for something new. HEAVY.AI sliced through and rendered that data so smoothly and effortlessly that I was instantly hooked.
Fast forward a few years, and I still think what HEAVY.AI does is pretty unique, and its early bet on GPU analytics was exactly where the industry still needs to go. HEAVY.AI is firmly focused on democratizing access to big data. This has the data volume and processing speed component of course, essentially giving everyone their own supercomputer. But an increasingly important aspect with the advent of large language models is in making spatial modeling accessible to many more people. These days, rather than spending years learning a complex interface with thousands of tools, you can just start a conversation with HEAVY.AI in the human language of your choice. The program not only generates the commands required, but also presents relevant visualizations.
Behind the scenes, delivering ease of use is of course very difficult. Currently, as the VP of Product Management at HEAVY.AI, I’m heavily involved in determining which features and capabilities we prioritize for our products. My extensive background in GIS allows me to really understand the needs of our customers and guide our development roadmap accordingly.
How has your previous experience in spatial environmental planning and startups influenced your work at HEAVY.AI?
Environmental planning is a particularly challenging domain in that you need to account for both different sets of human needs and the natural world. The general solution I learned early was to pair a method known as participatory planning with the technologies of remote sensing and GIS. Before settling on a plan of action, we’d make multiple scenarios and simulate their positive and negative impacts in the computer using visualizations. Using participatory processes let us combine various forms of expertise and solve very complex problems.
While we don’t typically do environmental planning at HEAVY.AI, this pattern still works very well in business settings. So we help customers construct digital twins of key parts of their business, and we let them create and evaluate business scenarios quickly.
I suppose my teaching experience has given me deep empathy for software users, particularly of complex software systems. Where one student stumbles in one spot is random, but where dozens or hundreds of people make similar errors, you know you’ve got a design issue. Perhaps my favorite part of software design is taking these learnings and applying them in designing new generations of systems.
Can you explain how HeavyIQ leverages natural language processing to facilitate data exploration and visualization?
These days it seems everyone and their brother is touting a new genAI model, most of them forgettable clones of each other. We’ve taken a very different path. We believe that accuracy, reproducibility and privacy are essential characteristics for any business analytics tools, including those generated with large language models (LLMs). So we have built those into our offering at a fundamental level. For example, we constrain model inputs strictly to enterprise databases and to documents provided inside an enterprise security perimeter. We also constrain outputs to the latest HeavySQL and Charts. That means that whatever question you ask, we will try to answer with your data, and we will show you exactly how we derived that answer.
With those guarantees in place, it matters less to our customers exactly how we process the queries. But behind the scenes, another important difference relative to consumer genAI is that we fine-tune models extensively against the specific types of questions business users ask of business data, including spatial data. So for example our model is excellent at performing spatial and time series joins, which aren’t in classical SQL benchmarks but which our users rely on daily.
We package these core capabilities into a Notebook interface we call HeavyIQ. IQ is about making data exploration and visualization as intuitive as possible by using natural language processing (NLP). You ask a question in English, like “What were the weather patterns in California last week?”, and HeavyIQ translates that into SQL queries that our GPU-accelerated database processes quickly. The results are presented not just as data but as visualizations: maps, charts, whatever’s most relevant. It’s about enabling fast, interactive querying, especially when dealing with large or fast-moving datasets. What’s key here is that it’s often not the first question you ask, but perhaps the third, that really gets to the core insight, and HeavyIQ is designed to facilitate that deeper exploration.
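As a rough illustration of the translation step described in this answer, the sketch below pairs the example question with one SQL query it might plausibly become. Everything specific here is an assumption made for illustration: the `weather_obs` table, its columns, the fixed "today" date, and the query itself are hypothetical, and SQLite stands in for the GPU database only so the example runs anywhere. It is not HeavyIQ output and does not use HeavySQL-specific syntax.

```python
import sqlite3

QUESTION = "What were the weather patterns in California last week?"

# Hypothetical SQL that a text-to-SQL model might emit for QUESTION.
# Table and column names are invented for this sketch.
GENERATED_SQL = """
SELECT obs_date,
       AVG(temp_c)    AS avg_temp_c,
       SUM(precip_mm) AS total_precip_mm
FROM weather_obs
WHERE state = 'CA'
  AND obs_date >= DATE(:today, '-7 days')
  AND obs_date < :today
GROUP BY obs_date
ORDER BY obs_date
"""


def run_demo(today="2024-06-10"):
    """Load a few made-up observations and run the generated query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE weather_obs "
        "(state TEXT, obs_date TEXT, temp_c REAL, precip_mm REAL)"
    )
    rows = [
        ("CA", "2024-06-01", 20.0, 0.0),  # more than a week old: excluded
        ("CA", "2024-06-05", 22.0, 1.0),
        ("CA", "2024-06-05", 24.0, 3.0),
        ("CA", "2024-06-08", 30.0, 0.0),
        ("NV", "2024-06-08", 35.0, 0.0),  # different state: excluded
    ]
    conn.executemany("INSERT INTO weather_obs VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(GENERATED_SQL, {"today": today}).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(QUESTION)
    for row in run_demo():
        print(row)
```

Showing the generated SQL next to the answer is what makes a result like this reproducible: a user can read the query, see which rows were included, and then ask a sharper follow-up question.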
What are the primary benefits of using HeavyIQ over traditional BI tools for telcos, utilities, and government agencies?
HeavyIQ excels in environments where you’re dealing with large-scale, high-velocity data, exactly the type of data telcos, utilities, and government agencies handle. Traditional business intelligence tools often struggle with the volume and speed of this data. For instance, in telecommunications, you might have billions of call records, but it’s the tiny fraction of dropped calls that you need to focus on. HeavyIQ lets you sift through that data 10 to 100 times faster thanks to our GPU infrastructure. This speed, combined with the ability to interactively query and visualize data, makes it invaluable for risk analytics in utilities or real-time scenario planning for government agencies.
The other advantage, already alluded to above, is that spatial and temporal SQL queries are extremely powerful analytically but can be slow or difficult to write by hand. When a system operates at what we call “the speed of curiosity”, users can ask both more questions and more nuanced questions. So for example a telco engineer might notice a temporal spike in equipment failures from a monitoring system, have the intuition that something is going wrong at a particular facility, and check this with a spatial query returning a map.
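The telco example in this answer amounts to a temporal question followed by a spatial one. The sketch below walks through that sequence under loud assumptions: the `failures` table, its columns, the sample rows, and the bounding box around the suspect facility are all hypothetical, and SQLite stands in for the real database. A plain longitude/latitude bounding box is used in place of a true geometry predicate, which a spatial SQL dialect would normally provide.

```python
import sqlite3


def find_spike_and_hotspot():
    """Find the busiest failure hour, then count failures near one facility."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE failures "
        "(failed_hour TEXT, facility TEXT, lon REAL, lat REAL)"
    )
    rows = [  # made-up monitoring records
        ("2024-06-08 01:00", "north", -121.9, 37.4),
        ("2024-06-08 02:00", "south", -118.2, 34.0),
        ("2024-06-08 03:00", "north", -121.9, 37.4),
        ("2024-06-08 03:00", "north", -121.9, 37.4),
        ("2024-06-08 03:00", "north", -121.8, 37.5),
        ("2024-06-08 03:00", "south", -118.2, 34.0),
    ]
    conn.executemany("INSERT INTO failures VALUES (?, ?, ?, ?)", rows)

    # Temporal question: which hour saw the most failures?
    spike_hour, spike_count = conn.execute(
        "SELECT failed_hour, COUNT(*) AS n FROM failures "
        "GROUP BY failed_hour ORDER BY n DESC LIMIT 1"
    ).fetchone()

    # Spatial follow-up: how many of that hour's failures fall inside a
    # hypothetical bounding box drawn around the suspect facility?
    params = {
        "hour": spike_hour,
        "min_lon": -122.0, "max_lon": -121.5,
        "min_lat": 37.0, "max_lat": 38.0,
    }
    (in_box,) = conn.execute(
        "SELECT COUNT(*) FROM failures "
        "WHERE failed_hour = :hour "
        "AND lon BETWEEN :min_lon AND :max_lon "
        "AND lat BETWEEN :min_lat AND :max_lat",
        params,
    ).fetchone()
    conn.close()
    return spike_hour, spike_count, in_box


if __name__ == "__main__":
    print(find_spike_and_hotspot())
```

Each query here is short, but the second one only gets asked if the first comes back fast enough to keep the engineer's train of thought going, which is the point being made about query speed.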
What measures are in place to prevent metadata leakage when using HeavyIQ?
As described above, we’ve built HeavyIQ with privacy and security at its core. This includes not only data but also several kinds of metadata. We use column and table-level metadata extensively in determining which tables and columns contain the information needed to answer a query. We also use internal company documents where provided to assist in what is known as retrieval-augmented generation (RAG). Lastly, the language models themselves generate additional metadata. All of these, but especially the latter two, can be of high business sensitivity.
Unlike third-party models where your data is typically sent off to external servers, HeavyIQ runs locally on the same GPU infrastructure as the rest of our platform. This ensures that your data and metadata remain under your control, with no risk of leakage. For organizations that require the highest levels of security, HeavyIQ can even be deployed in a completely air-gapped environment, ensuring that sensitive information never leaves specific equipment.
How does HEAVY.AI achieve high performance and scalability with massive datasets using GPU infrastructure?
The secret sauce is essentially in avoiding the data movement prevalent in other systems. At its core, this starts with a purpose-built database that’s designed from the ground up to run on NVIDIA GPUs. We’ve been working on this for over 10 years now, and we truly believe we have the best-in-class solution when it comes to GPU-accelerated analytics.
Even the best CPU-based systems run out of steam well before a middling GPU. Once this happens on CPU, the strategy requires distributing data across multiple cores and then multiple systems (so-called ‘horizontal scaling’). This works well in some contexts where things are less time-critical, but generally starts getting bottlenecked on network performance.
In addition to avoiding all of this data movement on queries, we also avoid it on many other common tasks. The first is that we can render graphics without moving the data. Then if you want ML inference modeling, we again do that without data movement. And if you interrogate the data with a large language model, we yet again do this without data movement. Even if you are a data scientist and want to interrogate the data from Python, we again provide methods to do this on GPU without data movement.
What that means in practice is that we can perform not only queries but also rendering 10 to 100 times faster than traditional CPU-based databases and map servers. When you’re dealing with the massive, high-velocity datasets that our customers work with, things like weather models, telecom call records, or satellite imagery, that kind of performance boost is absolutely essential.
How does HEAVY.AI maintain its competitive edge in the fast-evolving landscape of big data analytics and AI?
That’s a great question, and it’s something we think about constantly. The landscape of big data analytics and AI is evolving at an incredibly rapid pace, with new breakthroughs and innovations happening all the time. It certainly doesn’t hurt that we have a 10-year head start on GPU database technology.
I think the key for us is to stay laser-focused on our core mission: democratizing access to big, geospatial data. That means continually pushing the boundaries of what’s possible with GPU-accelerated analytics, and ensuring our products deliver unparalleled performance and capabilities in this domain. A big part of that is our ongoing investment in developing custom, fine-tuned language models that truly understand the nuances of spatial SQL and geospatial analysis.
We’ve built up an extensive library of training data, going well beyond generic benchmarks, to ensure our conversational analytics tools can engage with users in a natural, intuitive way. But we also know that technology alone isn’t enough. We have to stay deeply connected to our customers and their evolving needs. At the end of the day, our competitive edge comes down to our relentless focus on delivering transformative value to our users. We’re not just keeping pace with the market; we’re pushing the boundaries of what’s possible with big data and AI. And we’ll continue to do so, no matter how quickly the landscape evolves.
How does HEAVY.AI support emergency response efforts through HeavyEco?
We built HeavyEco when we saw some of our largest utility customers having significant challenges simply ingesting today’s weather model outputs, as well as visualizing them for joint comparisons. It was taking one customer up to four hours just to load data, and when you are up against fast-moving extreme weather conditions like fires… that’s just not good enough.
HeavyEco is designed to provide real-time insights in high-consequence situations, like during a wildfire or flood. In such scenarios, you need to make decisions quickly and based on the best possible data. So HeavyEco serves firstly as a professionally managed data pipeline for authoritative models such as those from NOAA and USGS. On top of those, HeavyEco allows you to run scenarios, model building-level impacts, and visualize data in real time. This gives first responders the critical information they need when it matters most. It’s about turning complex, large-scale datasets into actionable intelligence that can guide rapid decision-making.
Ultimately, our goal is to give our users the ability to explore their data at the speed of thought. Whether they’re running complex spatial models, comparing weather forecasts, or trying to identify patterns in geospatial time series, we want them to be able to do it seamlessly, without any technical barriers getting in their way.
What distinguishes HEAVY.AI’s proprietary LLM from other third-party LLMs in terms of accuracy and performance?
Our proprietary LLM is specifically tuned for the types of analytics we focus on, like text-to-SQL and text-to-visualization. We initially tried traditional third-party models, but found they didn’t meet the high accuracy requirements of our users, who are often making critical decisions. So, we fine-tuned a range of open-source models and tested them against industry benchmarks.
Our LLM is much more accurate for the advanced SQL concepts our users need, particularly in geospatial and temporal data. Additionally, because it runs on our GPU infrastructure, it’s also more secure.
In addition to the built-in model capabilities, we also provide a full interactive user interface for administrators and users to add domain or business-relevant metadata. For example, if the base model doesn’t perform as expected, you can import or tweak column-level metadata, or add guidance information and immediately get feedback.
How does HEAVY.AI envision the role of geospatial and temporal data analytics in shaping the future of various industries?
We believe geospatial and temporal data analytics are going to be critical for the future of many industries. What we’re really focused on is helping our customers make better decisions, faster. Whether you’re in telecom, utilities, government, or another sector, having the ability to analyze and visualize data in real time can be a game-changer.
Our mission is to make this kind of powerful analytics accessible to everyone, not just the big players with massive resources. We want to ensure that our customers can take advantage of the data they have, to stay ahead and solve problems as they arise. As data continues to grow and become more complex, we see our role as making sure our tools evolve right alongside it, so our customers are always prepared for what’s next.
Thank you for the great interview. Readers who wish to learn more should visit HEAVY.AI.