
Taming Performance in a Real-Time Geospatial Application

  • 23 hours ago
  • 6 min read

ApptiGEO is a geospatial app where users organize everything — map assets, text notes, images, documents, videos, and more — in a single tree hierarchy. The tree sidebar is the backbone of our UI, tightly coupled with an interactive map on the right.


The challenge: how do you model 13 different content types under one tree, query them efficiently, and serve them over a real-time WebSocket — without the backend collapsing under its own weight?


Let’s start with the data model that made this possible.


The Problem: One Tree, Many Shapes

To streamline asset management, we wanted the data organized like a file system: folders, files, drag and drop. All in one place, visually distinguishable, with shared functionality.

Geospatial map interface showing blue buildings and streets. Sidebar displays a house photo labeled "Picture" with coordinates in the footer.

A file system has one abstraction — the inode — that can point to a file, a directory, or a symlink. We needed the same flexibility: a single tree node that can point to any of 13 different model types (Asset + 12 DataItem subclasses). They share some fields but have wildly different storage.


An asset is the map-pinned container — it holds geometry, color, and tags. Data items are what lives inside: texts, images, PDFs, videos, chats, drone imagery, and 6 more types. One tree node, 13 possible shapes.


Django has a powerful contenttypes framework that “can track all of the models installed in your Django-powered project, providing a high-level, generic interface for working with your models”. It also gives us GenericForeignKey, which simplifies creating truly generic relationships. Unlike a regular ForeignKey, which points to exactly one model, a GenericForeignKey (GFK) can point to any model.


Our GenericTreeNode stores two fields: a content_type (which model?) and object_id (which row?). That’s the recipe for how one tree node can reference an Asset, a TextDataItem, or an ImageDataItem — all through the same two columns.

Code snippet of a Python class, GenericTreeNode, defining tree nodes with fields like content_type and object_id. Dark background.
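
To make the two-column idea concrete, here is a framework-free sketch using only the standard library. It is not the Django model from the screenshot: TreeNode, TABLES, resolve() and the sample fields are hypothetical stand-ins, and a plain dict plays the role of the content types lookup.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-ins for two of the 13 models; field names are illustrative.
@dataclass
class Asset:
    id: int
    name: str


@dataclass
class TextDataItem:
    id: int
    body: str


# Plays the role of the content types lookup: type label -> rows keyed by primary key.
TABLES = {
    "asset": {1: Asset(1, "Pump station")},
    "textdataitem": {1: TextDataItem(1, "Inspection notes")},
}


@dataclass
class TreeNode:
    id: int
    parent_id: Optional[int]
    content_type: str  # which model?
    object_id: int     # which row?

    def resolve(self):
        """Follow the (content_type, object_id) pair to the concrete object."""
        return TABLES[self.content_type][self.object_id]


root = TreeNode(10, None, "asset", 1)
child = TreeNode(11, 10, "textdataitem", 1)
```

Both nodes carry object_id 1, yet they resolve to different objects. That is why the pair of columns is needed, not object_id alone.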

By the way, if you wonder what TimeStampedModel and Orderable are: they are abstract base classes, another type of model inheritance in Django. An abstract base class supplies our models with common fields, methods, and properties without creating an additional table in the database. For TimeStampedModel the idea is simple and well known: add created_at and updated_at timestamps to our models.

Code snippet showing a Python class 'TimeStampedModel' with timestamp fields in a dark theme editor. Comments describe a base model.

As for Orderable, it adds helper fields and database indexes for performing fractional indexing in constant time.
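
The article does not show how Orderable stores its keys, so the following is only a sketch of the general fractional indexing technique. The function name key_between and the choice of exact Fraction keys are assumptions made for illustration.

```python
from fractions import Fraction


def key_between(before=None, after=None):
    """Return a sort key strictly between two neighbours (None means an open end)."""
    if before is None and after is None:
        return Fraction(0)           # first item in an empty list
    if before is None:
        return after - 1             # insert at the front
    if after is None:
        return before + 1            # append at the end
    return (before + after) / 2      # drop between two existing items


a = key_between()             # first item
c = key_between(before=a)     # appended after a
b = key_between(a, c)         # moved between a and c
```

Moving an item touches only its own key. The neighbours keep theirs, which is what makes a drag-and-drop reorder a constant-time write.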


Multi-Table Inheritance & django-polymorphic

Now, how do you query the base table and get the concrete type back? Let’s see MTI in action and understand its pitfalls.


Imagine a notifications service. Every notification has the following fields: id, user, message, created_at. But an email also needs subject and html_body, a push needs device_token, an SMS needs phone_number. You could cram nullable columns into one table or use separate tables and lose unified queries. There’s a third option.


In multi-table inheritance, Django creates one table for the parent, one per child, linked by an implicit OneToOneField. We include all shared fields on the parent (or the base class if you will), and specialized fields on the child.

Python code screenshot for Django models: Notification with fields user, message, and created_at. Inheritance includes EmailNotification, PushNotification, and SMSNotification.

Seems very straightforward at first, doesn’t it? But what happens if we try to retrieve all notifications? 🤔

Code snippet showing creation of email, push, and SMS notifications with Python. Text highlights different fields like message and user.
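
What multi-table inheritance looks like at the table level can be sketched with the standard library's sqlite3 module. Table names are simplified (Django would prefix them with the app label), the sample rows are made up, and only two of the three child tables are shown.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Parent table: the shared fields.
    CREATE TABLE notification (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        message TEXT NOT NULL
    );
    -- One table per child; its primary key doubles as the one-to-one link to the parent.
    CREATE TABLE emailnotification (
        notification_ptr_id INTEGER PRIMARY KEY REFERENCES notification(id),
        subject TEXT NOT NULL
    );
    CREATE TABLE smsnotification (
        notification_ptr_id INTEGER PRIMARY KEY REFERENCES notification(id),
        phone_number TEXT NOT NULL
    );
    INSERT INTO notification VALUES (1, 42, 'Welcome'), (2, 42, 'Your code is 1234');
    INSERT INTO emailnotification VALUES (1, 'Hello!');
    INSERT INTO smsnotification VALUES (2, '+48123456789');
""")

# Querying the parent returns every notification, but only the shared columns.
parent_rows = db.execute("SELECT id, message FROM notification ORDER BY id").fetchall()

# Child-specific fields require a join to the matching child table.
email_rows = db.execute("""
    SELECT n.id, n.message, e.subject
    FROM notification n
    JOIN emailnotification e ON e.notification_ptr_id = n.id
""").fetchall()
```

Nothing in a parent row says which child table holds the rest of it, which is the gap described next.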

Plain MTI doesn’t recognize the underlying child classes. That’s when the django-polymorphic library kicks in. In ApptiGEO, this is exactly what powers our data item layer:

Python code snippet defining Django models for polymorphic data items: DataItem, TextDataItem, ImageDataItem, and UrlDataItem.

Just by subclassing PolymorphicModel we can easily retrieve all inherited child classes (here, our lovely data items) and access their specific fields right out of the box:

Code snippet showing polymorphic queryset returning instances like "TextDataItem," "ImageDataItem," and "UrlDataItem" with comments in green.
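
Conceptually, downcasting needs a type tag on every base row plus one follow-up lookup per concrete type found in the result. The sketch below imitates that with plain dicts. It is not django-polymorphic's implementation, and the names (BASE_ROWS, CHILD_TABLES, downcast, the ctype key) are hypothetical.

```python
from collections import defaultdict

# Base-table rows: shared fields plus a type tag.
BASE_ROWS = [
    {"id": 1, "name": "Notes", "ctype": "text"},
    {"id": 2, "name": "Facade", "ctype": "image"},
    {"id": 3, "name": "Summary", "ctype": "text"},
]
# Child tables keyed by the parent's primary key (made-up fields).
CHILD_TABLES = {
    "text": {1: {"body": "hello"}, 3: {"body": "done"}},
    "image": {2: {"width": 800}},
}


def downcast(base_rows):
    """Merge each base row with its child fields, using one child lookup per type."""
    ids_by_type = defaultdict(list)
    for row in base_rows:
        ids_by_type[row["ctype"]].append(row["id"])

    child_fields = {}
    queries = 0
    for ctype, ids in ids_by_type.items():
        queries += 1  # stands in for one SELECT ... WHERE id IN (...) on a child table
        for pk in ids:
            child_fields[pk] = CHILD_TABLES[ctype][pk]

    # Reassemble in the original order of the base rows.
    merged = [{**row, **child_fields[row["id"]]} for row in base_rows]
    return merged, queries


items, extra_queries = downcast(BASE_ROWS)
```

The number of follow-up lookups grows with the number of distinct types in the result, not with the number of rows.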

That’s the power of automatic downcasting in action. It’s a nice DX improvement and simplifies working with polymorphic models. So, we got what we wanted: one tree, many shapes. Job done, right?

Well, not quite. We loaded 5000+ real client records, opened the app, and watched the tree endpoint that returns all existing nodes for a given workspace take 10-15 seconds to respond. ☠️


From 10s to 1s: Tree Optimization

We profiled the tree endpoint and realized that there was no single bottleneck. The slowness was coming from multiple layers, each adding overhead that compounded on large datasets. Here’s what we changed.


Manual Dict Serialization

DRF’s ModelSerializer is convenient but expensive at scale — every field goes through individual to_representation calls, validators, and schema introspection. With 5000+ nodes, that overhead added up fast. We replaced the serializer layer entirely with a hand-rolled function that issues a single .values() query per content type (assets, data items, versions, authors) and assembles plain Python dictionaries directly. No field classes, no binding. The API contract stayed identical; only the path to produce it changed.
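
A minimal sketch of the assembly step, assuming the rows arrive as flat dicts, which is the shape a .values() query yields. The function name build_tree_payload and every key name are hypothetical. The real response contract is not shown in the article.

```python
def build_tree_payload(node_rows, asset_rows, item_rows):
    """Assemble response dicts from flat rows, with no serializer classes involved."""
    # Index each content type once so every node is a dictionary lookup away.
    lookup = {
        "asset": {row["id"]: row for row in asset_rows},
        "dataitem": {row["id"]: row for row in item_rows},
    }
    payload = []
    for node in node_rows:
        content = lookup[node["content_type"]].get(node["object_id"])
        payload.append({
            "id": node["id"],
            "parent": node["parent_id"],
            "type": node["content_type"],
            "content": content,
        })
    return payload


nodes = [
    {"id": 10, "parent_id": None, "content_type": "asset", "object_id": 1},
    {"id": 11, "parent_id": 10, "content_type": "dataitem", "object_id": 1},
]
assets = [{"id": 1, "name": "Pump station"}]
data_items = [{"id": 1, "title": "Notes"}]
tree = build_tree_payload(nodes, assets, data_items)
```

The work per node is two dict lookups and one dict literal, regardless of how many fields a serializer class would have declared.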


PostGIS AsGeoJSON Instead of Python Geometry Processing

Each asset carries a geometry column. Previously, Django loaded it as a GEOS Python object, and DRF’s GeoFeatureModelSerializer re-serialized it to GeoJSON in Python. For thousands of assets, that’s thousands of round-trips through the GEOS C library and Python’s json module. The fix was a single annotation, annotate(geojson=AsGeoJSON("geometry")), which delegates the conversion to PostGIS. The database returns a ready-made GeoJSON string; we just orjson.loads() it into a dict. No Python geometry objects are ever constructed.
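
On the Python side, the remaining work is decoding a string. The sketch below assumes each row is a dict with a geojson text column, as the annotation above would produce. The function name asset_row_to_dict, the name field and the sample coordinates are made up, and the standard json module stands in for orjson to keep the example dependency-free.

```python
import json


def asset_row_to_dict(row):
    """Turn one flat row with a `geojson` string column into a response dict."""
    return {
        "id": row["id"],
        "name": row["name"],
        # The database already produced GeoJSON text; decoding it is the only Python work.
        "geometry": json.loads(row["geojson"]) if row["geojson"] else None,
    }


row = {"id": 7, "name": "Hydrant", "geojson": '{"type":"Point","coordinates":[19.94,50.06]}'}
asset = asset_row_to_dict(row)
```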


orjson Renderer

The final JSON encoding step was Python’s standard json module via DRF’s default renderer. We swapped it for a custom ORJSONRenderer backed by orjson, a Rust-implemented JSON library that’s typically 2-10x faster than json.dumps on real-world data, with native support for the datetime and UUID types we were already using (Decimal is not serialized natively and needs a default handler). No changes to the response structure.


Smarter Pagination

The old pagination ran a COUNT(*) query on every single page request, an expensive full-index scan that was re-executed even when the client just wanted page 2. We extended LimitOffsetPagination to count only on the first page (offset=0). For subsequent pages, it fetches limit + 1 rows instead: with limit=500, if a 501st row comes back, there’s a next page; if not, there isn’t. We also raised the default page size from 200 to 2000, so a typical workspace fits in fewer client-server round-trips.
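
The limit + 1 trick is independent of DRF, so it can be sketched as a plain function. The names paginate, fetch and count are hypothetical. The actual change was a LimitOffsetPagination subclass that the article does not show.

```python
def paginate(fetch, count, limit, offset=0):
    """Count only on the first page; on later pages, probe for a next page instead.

    fetch(offset, n) returns up to n rows; count() returns the total number of rows.
    """
    rows = fetch(offset, limit + 1)   # ask for one extra row
    has_next = len(rows) > limit      # the extra row proves another page exists
    return {
        "count": count() if offset == 0 else None,
        "results": rows[:limit],      # never hand the probe row to the client
        "next_offset": offset + limit if has_next else None,
    }


DATA = list(range(1200))
count_calls = []


def fetch(offset, n):
    return DATA[offset:offset + n]


def count():
    count_calls.append(1)             # record how often the expensive count runs
    return len(DATA)


first = paginate(fetch, count, limit=500)
second = paginate(fetch, count, limit=500, offset=500)
last = paginate(fetch, count, limit=500, offset=1000)
```

Across the three pages the count runs once, and the final page is detected by the absence of the extra row.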


Bypassing django-polymorphic

Remember how PolymorphicModel magically gives you back the right subclass when you query the base table? It does that by issuing extra queries (or big LEFT JOINs across every child table) to figure out the concrete type of each row. Lovely at the REPL, painful at 5000+ rows. We dropped down to .non_polymorphic().values() for the bulk reads, which gives us plain dicts straight from the parent table with no child-table joins. When we do need a concrete child (e.g. for the detail view), we still go through the polymorphic manager. Best of both worlds: ergonomic models for everyday code, raw queries on the hot path 🔥.


The Result

The combined changes brought the tree endpoint from 10-15 seconds down to roughly 1 second on the real client dataset 🚀. The benchmark runs mixed nodes (assets + data items) across 5 iterations and reports the speedup factor between the old DRF path and the new dict path:

Text showing a benchmark comparison for tree nodes. Old is avg 9312.3ms, New is avg 703.7ms, with 13.2x speedup. Black background.
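
The benchmark script itself is not shown, so this is only a sketch of how such a comparison can be measured and how the reported factor follows from the two averages. The helper names benchmark and speedup are hypothetical.

```python
import time
from statistics import mean


def benchmark(fn, iterations=5):
    """Return the average wall-clock time of fn() in milliseconds."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000)
    return mean(timings)


def speedup(old_ms, new_ms):
    """How many times faster the new path is than the old one."""
    return old_ms / new_ms


# Averages quoted in the benchmark output above.
factor = speedup(9312.3, 703.7)
```

Rounded to one decimal place, the factor matches the 13.2x in the output.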

Lessons Learned

The biggest takeaway is the obvious one in hindsight: profile before you guess. There wasn’t one bad thing slowing us down. Turned out, there were five problems at once — the serializer, the geometry pipeline, the JSON encoder, the pagination, and django-polymorphic's join strategy — each individually tolerable, collectively catastrophic on real data.


📚 A few things that stuck with us:

  • Convenience abstractions have runtime costs. DRF’s ModelSerializer and PolymorphicModel are both wonderful — until you’re paying their per-row tax many times in a single request.

  • Databases are good at database things. We were shipping geometries from PostGIS into Python just to serialize them back to GeoJSON. One AsGeoJSON annotation later, and PostGIS does in microseconds what GEOS + Python’s json module were doing in seconds.

  • Re-running COUNT(*) on every page is silly. Pagination felt like a solved problem until we realized we were doing a full-index scan on page 47 just to tell the client something it already knew. Count once, then stop.

  • The API contract and the serialization path are independent. We rewrote the entire pipeline behind the tree endpoint — serializer gone, geometry handled in SQL, polymorphic bypassed — and the frontend didn’t change a single line. Clients care about JSON, not about how you got there.


Summary - Real-Time Geospatial Applications

We set out to model “one tree, many shapes” — a single hierarchy that can hold 13 different content types, served in real time over WebSockets to a geospatial map UI. Django’s content types framework and django-polymorphic got us there elegantly. When real client data revealed the cost of that elegance at scale, a targeted set of changes sped up our main endpoint without touching the API contract.


The tree is just one piece of what we’re building at Apptimia. ApptiGEO is a full geospatial field operations platform: interactive maps, drone orthophoto processing, real-time collaboration, AI-powered asset tagging, and a growing task management layer, all organized through the same tree structure described here. If this kind of problem space interests you, get in touch with us!


Adrian T., Senior Software Engineer at Apptimia

 
 