Loading Spatial Data Into SQL Azure Part 1
I'm currently working on a university project for which we've elected to use a client-server architecture: an Android app on the front end and a web API as the backend. The server runs on Azure's App Service as a .NET Web API project (one of our members has a free pass to run anything they want through Azure for a few months), and we're using an automatically configured Azure SQL database to handle all our data. So far, so good.
In fact, it's really cool. Everything integrates really well, especially with Visual Studio, and it runs impressively fast. Being able to code in C# while still running a nimble cloud-based service is a big plus.
One big challenge we faced was that we needed to be able to identify a user's LGA (Local Government Area - it's an Australian thing; see here) and fetch certain identifying data about that LGA based on the user's current latitude and longitude.
As per usual, the Australian government makes it difficult for people to get access to their data in a useful format. They don't provide their own API to do this lookup. I ended up Googling how to determine whether a point falls inside the boundaries of a shape; it turns out you can do exactly this in MSSQL using its spatial functionality - which Azure SQL supports (yay!).
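The lookup itself ends up being a one-liner in T-SQL: build a geography point from the latitude and longitude and ask which stored boundary intersects it. Here's a minimal sketch of how that might be called from C#; the table and column names (Lgas, Name, Boundary) are just placeholders, not anything from our actual schema:

    using System.Data.SqlClient;

    class LgaLookup
    {
        // Finds the LGA whose boundary contains the given point.
        // Assumes a table Lgas(Name nvarchar, Boundary geography) - adjust to your schema.
        static string FindLga(string connectionString, double latitude, double longitude)
        {
            const string sql =
                @"SELECT TOP 1 Name
                  FROM Lgas
                  WHERE Boundary.STIntersects(geography::Point(@lat, @lon, 4326)) = 1";

            using (var connection = new SqlConnection(connectionString))
            using (var command = new SqlCommand(sql, connection))
            {
                command.Parameters.AddWithValue("@lat", latitude);
                command.Parameters.AddWithValue("@lon", longitude);
                connection.Open();
                return command.ExecuteScalar() as string; // null if no LGA matches
            }
        }
    }

The 4326 is the SRID for WGS84, which is what GPS coordinates from the phone will be in anyway.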
Here's where the tricky bit starts. The ABS (Australian Bureau of Statistics) provides the LGA boundary shapes, but only as one monolithic file (read: every LGA in Australia, even though I only want the NSW ones for this project) and in the annoying .shp (read: shapefile) format - an Esri binary format that feels vaguely proprietary even though the spec is published. Yay.
Luckily, with a bit of Google-fu, I came across this guy's blog. He wrote an awesome little project that lets you import these shapefiles into Azure SQL! Score! Unfortunately I had to build it and work out how to use it myself, which meant fiddling around in C# and reading the raw data in the .shp file, but eventually I got it all working and it's exporting to Azure right now.
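For the curious, the import side essentially boils down to converting each shape to well-known text (WKT) and inserting it as a geography value. A rough sketch, again with my placeholder schema (Lgas, Name, Boundary) and assuming polygonWkt already holds something like POLYGON((151.2 -33.8, ...)) pulled out of the shapefile:

    using System.Data.SqlClient;

    class ShapeImporter
    {
        // Inserts one LGA polygon into Azure SQL as a geography value.
        // Assumes a table Lgas(Name nvarchar, Boundary geography) - adjust to taste.
        static void InsertPolygon(string connectionString, string lgaName, string polygonWkt)
        {
            const string sql =
                @"INSERT INTO Lgas (Name, Boundary)
                  VALUES (@name, geography::STGeomFromText(@wkt, 4326))";

            using (var connection = new SqlConnection(connectionString))
            using (var command = new SqlCommand(sql, connection))
            {
                command.Parameters.AddWithValue("@name", lgaName);
                command.Parameters.AddWithValue("@wkt", polygonWkt);
                connection.Open();
                command.ExecuteNonQuery();
            }
        }
    }

One gotcha worth flagging: the geography type cares about ring orientation, so if an insert complains that a polygon exceeds a single hemisphere you may need to reverse the point order (or call ReorientObject() on SQL Server 2012 and later).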
It's slow going, and there are around 570 polygons to export to Azure if you want to utilise the same dataset (find it here), so only start the import when you have a fair bit of time to spare. Alternatively, leave it running overnight.
I'll probably play around with it in the next few days to give it a proper exponential back-off and retry process, and maybe a resume function, since it keeps killing my internet connection - I assume the number of web requests runs into the millions given how many points each polygon contains.
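For reference, the back-off I have in mind is nothing fancy - just wrap each insert in a retry loop that doubles its wait time after every failure. A quick sketch (InsertPolygon here is the hypothetical helper from the earlier snippet):

    using System;
    using System.Threading;

    static class Retry
    {
        // Retries an action with exponential back-off (1s, 2s, 4s, 8s...)
        // up to maxAttempts before giving up and rethrowing.
        public static void WithBackoff(Action action, int maxAttempts = 5)
        {
            for (int attempt = 0; ; attempt++)
            {
                try
                {
                    action();
                    return;
                }
                catch (Exception)
                {
                    if (attempt + 1 >= maxAttempts)
                        throw;
                    Thread.Sleep(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
                }
            }
        }
    }

    // Usage: Retry.WithBackoff(() => InsertPolygon(connectionString, name, wkt));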
In the future I might set up my own Azure App Service instance to serve this data, depending on the cost of running an App Service instance and a database.