Realm: How I Learned to Love Databases Again

iOS provides two system level database frameworks: SQLite and Core Data. While both of these frameworks are great at what they do, they both have high learning curves and require a fair amount of boilerplate code in order to get started. Realm is a new database framework wholly designed from scratch to make persisting data easier than that. In this iOS Conf SG talk, Tim Oliver introduces Realm and contrasts it to his experiences with SQLite and Core Data. He also demonstrates how to get started coding with it, best implementation practices, and how to migrate an existing app.


Introduction

I’m Tim Oliver from a company called Realm. I will share with you the story of my journey with all the databases that iOS has for local storage, up until Realm. I’ll give an overview of how the API works and what our methodologies are. Lastly, I’ll introduce the new hotness we recently announced, the Realm Platform.

My History with Databases

I’m from Perth, Australia (which is in the same time zone as Singapore). At the moment I’m hanging out in San Francisco. I studied joint Computer Science and Multimedia at university in Perth. That was an interesting experience, because it was like doing database design for one half of the week and then Photoshop and Flash animations the other half of the week. I did web development until 2013.

I moved to full time iOS development in 2013 and spent a year in Japan working at an iOS company there, which was good fun and hard in certain languages. I love building apps in my free time and I like open sourcing the components of my apps to give back to the community. I had a nice gentleman come up and tell me he contributed a PR to one of my open source libraries (thanks for that, Kevin). And I also love karaoke and video games.

Back in 2011, I made an app called iPokedex, which is basically a completely offline index of every single bit of Pokemon data you can have. It’s very useful when you’re on the train playing Pokemon on your DS, and you need to know what level Pikachu learns Thunder at (you don’t evolve him into Raichu before he learns that, because Raichu can’t learn Thunder).

I was thinking about using JSON, but JSON would have been terrible because the Pokemon dataset is not small. I decided on SQLite. I aggregated a bunch of data from a huge number of Pokemon websites into a SQLite file and installed that as a read-only bundle in the app.

SQLite

This was a long time ago (iPhone OS 3 time), there was no GCD, it was all NSThreads and no ARC. I implemented a model that would do a query of SQLite, and then pull that data out and put it into a manual NSObject subclass.

But because SQLite is a copy operation, you have to be very careful about how much data you have in memory. Back on the iPhone 3G, you had only 128 MB of RAM to work with. I had two versions of each object, a light one for table views and then a detailed one for actual detail view controllers. There was a ton of data: the Pokedex text is 20 MB across every game.

SQLite has tools where you can generate databases from scratch. This is what the internals look like (many tables and data). But the app worked, and it was pretty performant on an iPhone 3G. I had a few takeaways from SQLite from that experience.

Because SQLite is a C library, it’s always necessary to have some bridging code that will make the experience of using a C library in Objective-C and Swift nice. You can write that yourself, or there are many on GitHub as well.

SQL is not a normal skill you need for writing iOS apps. You need to know what a primary key is, what a foreign key is, how to index columns properly, how the structure of SQL language works (e.g., “select * from database”). There’s another level of skill and knowledge you need. And you have to take data out of a query and then map it to an object in order to work with it.

To have a table view, I need an array of objects, and there’s effort to map data from a SQL query into a usable format. Also, because you’re copying data from disk to memory, you have to worry about how many entries are in memory at once. You need pagination most of the time.
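To make the mapping and pagination work concrete, here is a minimal sketch using the SQLite C API from Swift on an Apple platform. The `pokemon` table, its columns, the `PokemonRow` type, and the `fetchPage` helper are all hypothetical names made up for illustration; they are not from the original app.

```swift
import Foundation
import SQLite3

// A plain value type that every result row has to be copied into by hand.
struct PokemonRow {
    let id: Int
    let name: String
}

// Fetch one page of rows and map each one to a PokemonRow.
func fetchPage(from db: OpaquePointer?, limit: Int32, offset: Int32) -> [PokemonRow] {
    var statement: OpaquePointer?
    let sql = "SELECT id, name FROM pokemon ORDER BY id LIMIT ? OFFSET ?;"
    guard sqlite3_prepare_v2(db, sql, -1, &statement, nil) == SQLITE_OK else { return [] }
    defer { sqlite3_finalize(statement) }

    // Bind parameters are 1-indexed.
    sqlite3_bind_int(statement, 1, limit)
    sqlite3_bind_int(statement, 2, offset)

    var rows: [PokemonRow] = []
    while sqlite3_step(statement) == SQLITE_ROW {
        // Result columns are 0-indexed; every value is copied out of SQLite.
        let id = Int(sqlite3_column_int(statement, 0))
        let name = sqlite3_column_text(statement, 1).map { String(cString: $0) } ?? ""
        rows.append(PokemonRow(id: id, name: name))
    }
    return rows
}

// Build a throwaway in-memory database so the sketch is self-contained.
var db: OpaquePointer?
if sqlite3_open(":memory:", &db) != SQLITE_OK { fatalError("Could not open database") }

let setup = """
CREATE TABLE pokemon (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
INSERT INTO pokemon (id, name) VALUES (25, 'Pikachu'), (26, 'Raichu'), (133, 'Eevee');
"""
if sqlite3_exec(db, setup, nil, nil, nil) != SQLITE_OK { fatalError("Could not create schema") }

// Only one page of rows is held in memory at a time.
let firstPage = fetchPage(from: db, limit: 2, offset: 0)
let secondPage = fetchPage(from: db, limit: 2, offset: 2)
print(firstPage.map { $0.name })
print(secondPage.map { $0.name })

sqlite3_close(db)
```

Even in this small form, the schema, the query, the column indexes, and the object type all have to be kept in agreement manually.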

Lastly, SQLite is very manual: you have to execute a query to generate the database, and then if you want to change the schema, that’s another set of queries you have to execute. It works, and it is a pretty decent solution, but requires time.

Core Data

The next app I made was a DRM-free comic book reader called iComics. The idea of this app is, you have all those comic book apps that are walled gardens (Marvel and DC) where you have to buy and read their comics inside the app, but there are other comic distributors on the Internet (e.g., Humble Bundle, drivethrucomics.com) where they give you the comic, but there’s no complementary app you can use to read it on your iOS devices. In this app any comic can be added and read.

I used Core Data for this originally. I needed some data persistence solution, because much of the time a comic book is a ZIP file with JPEGs, and transforming that into a usable medium requires pre-processing. Most importantly, pages are not organized in a ZIP file: you have to manually go through and generate the list (Page1.jpeg, Page2.jpeg, etc.) and that needs to be stored somewhere.

Core Data was touted as being like SQLite but requiring less effort. It gives you managed objects so you don’t have to do any manual copying between SQLite and your objects. Schema migrations are done for free, and you don’t have to know SQLite. You can do native NSPredicate fetches to get a good experience with less effort. And, of course, it’s a native framework supported by Apple and proven to be bulletproof.

My experience was a year of pain and suffering. It was not fun, it was not a good experience, Core Data is pain. It was a lot of effort to learn. You have to know NSPersistentStoreCoordinator and NSManagedObjectContext and NSManagedObject. You have to know what an xcdatamodel is before you can write any code. Then you get to write code to make all those elements work together. Much effort before you can even start saving stuff to disk. If anything goes wrong, because it’s an Apple framework, you might get a cryptic error but you have no idea what is going on.
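As a rough illustration of how many pieces have to cooperate, here is a minimal sketch of a Core Data stack built entirely in code with an in-memory store. The `Comic` entity and its `title` attribute are hypothetical, and the model is described in code only to keep the sketch self-contained; a real app would normally define it in an xcdatamodel file.

```swift
import CoreData

// 1. Describe the schema (normally done in an .xcdatamodel file).
let titleAttribute = NSAttributeDescription()
titleAttribute.name = "title"
titleAttribute.attributeType = .stringAttributeType

let comicEntity = NSEntityDescription()
comicEntity.name = "Comic"
comicEntity.managedObjectClassName = NSStringFromClass(NSManagedObject.self)
comicEntity.properties = [titleAttribute]

let model = NSManagedObjectModel()
model.entities = [comicEntity]

// 2. The coordinator connects the model to a store (in memory here).
let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)
_ = try! coordinator.addPersistentStore(ofType: NSInMemoryStoreType,
                                        configurationName: nil,
                                        at: nil,
                                        options: nil)

// 3. The context is the scratchpad that objects are created and fetched in.
let context = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
context.persistentStoreCoordinator = coordinator

// 4. Only now can an object be inserted and saved.
let comic = NSEntityDescription.insertNewObject(forEntityName: "Comic", into: context)
comic.setValue("Issue #1", forKey: "title")
try! context.save()

// 5. Reading it back needs a fetch request.
let request = NSFetchRequest<NSManagedObject>(entityName: "Comic")
let comics = try! context.fetch(request)
print(comics.count)
```

This leaves out typed NSManagedObject subclasses, migrations, and background contexts, each of which adds more setup.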

You’re not getting any benefits over SQLite, because it’s built on SQLite. You still have the memory copy issues; you still have to worry about paging things into memory and out again, and making sure that you’re not going to overrun your available memory. Moreover, if the automatic schema migration of Core Data does not work, it half blows away your database–you have a half working database, and that makes the app fail in ways I never thought it could.

I had a shipping version on the App Store, and I had a new version in beta, and all the beta testers were saying, “What did you do? This thing is terrible, it’s doing weird crashes, and half of my comics are there; the comics are there, but the thumbnails are gone.”

I was spending more time debugging Core Data than I was writing code. I was close to going back to SQLite, knowing it would be more effort but would at least be a bit more stable… when I heard about Realm.

Realm

I went to the Realm website. Because databases sit in the background, they’re not really a front-facing part of the UI. I thought, “My comics app is now unshippable. I’m going to tear Core Data out and stick Realm in it. I’m sure Realm cannot be as bad as Core Data.”

To do that, I set aside one evening to convert a completely Core Data app to Realm. I changed all the NSManagedObjects’ subclass names to RLMObject names. I removed all the NSNumber properties in my managed objects because Realm does native Ints and CGFloats and Doubles and Booleans. At that point the code stopped compiling because there was obviously a huge mismatch.

There was the matter of going through every single build error and figuring out how to replace the Core Data fetch and save operations with what Realm has. I didn’t need to do this, because I knew the data could be regenerated next time the app was opened, but some people have implemented ways to, without using NSManagedObjects, pull the data directly out of Core Data and move it to a Realm instance. Straightforward, and there are good tips on the Internet for how to do that.

It took three hours. I hit Build and it worked. After that, I joined the company. And it has been amazing ever since.

Realm Database

Some people think Realm is SQLite… and that is a huge misconception. Realm is a completely custom C++ engine designed by two engineers who used to work for Nokia. They took it to Y Combinator and made a company out of it. As a result, it’s not structured like SQLite at all. It treats everything inside it like objects. You’re not doing any data transformation between how you’re working at it with your own code and how it’s saved on disk. It’s objects all the way through.

Another cool thing is that Realm uses memory-map techniques to provide a zero-copy experience. When you have an object in your own code, you’re not working with a copy of your object. You’re using a pointer that is pointing straight to the entry; memory maps to disk. You’re getting a very fast experience because there’s no overhead of copying and there is no actual memory consumption.

While it is a C++ core, we provide Objective-C and Swift native frameworks. We eliminate that SQLite feel of needing to implement your own bridge. It uses cool internal Objective-C runtime reflection techniques to do method swizzling and property access in the background. There’s not much you have to do, because it does smart things in the background to set itself up.

And Realm is completely free and open source. The core and all the bindings are available on our GitHub account under the Apache license.

So what does a Realm “hello world” app look like?

Hello World!


import RealmSwift

// Define our object model
class Dog: Object {
   dynamic var name = ""
   dynamic var age = 0
}

// Create a new instance to save
let myDog = Dog()
myDog.name = "Rex"
myDog.age = 4

// Save it to Realm
let realm = try! Realm()
try! realm.write {
   realm.add(myDog)
}

There are four important steps. First, you import the framework. Next, you define an object model, and this is what gets saved to disk. Every property on it is persisted. The dynamic keyword is very important, because that’s required to work with the Objective-C reflection system.

The next step is you create a new object from scratch. And then you fill out the data. At the bottom, you reference an object called realm, and you open a write transaction, and you then add that Dog reference to the realm. That’s all you need.

If you were to execute that and look at what was written to disk in the Documents directory of your app, there’s a file called default.realm. We provide an introspection tool called the Realm Browser, and you can see the properties we put up were saved to disk.

Updated October 19, 2017: With the launch of Realm Platform 2.0, Realm Browser for macOS has been deprecated, and is replaced by Realm Studio for Mac, Linux, and Windows to manage both local and synced Realms.

That’s a lot less code than Core Data, which would have been at least 100 lines to get started. And we were only working with two objects: the Realm class and the Object.

The Realm class represents .realm files on disk. It’s a context for reads and writes to disk. It can be configured; if you don’t want it to be saved to the Documents directory, there’s a Configuration object that you can pass into the Realm instantiation method, and you can save it to wherever you want. You can make it a cache, save it to the caches, or application support.
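A minimal sketch of redirecting a Realm file into the caches directory, assuming the RealmSwift framework is linked. The file name `cache.realm` and the variable names are made up for illustration.

```swift
import Foundation
import RealmSwift

// Locate the app's caches directory.
let cachesDirectory = FileManager.default.urls(for: .cachesDirectory,
                                               in: .userDomainMask)[0]

// Start from the default configuration and change only the file location.
var cacheConfiguration = Realm.Configuration()
cacheConfiguration.fileURL = cachesDirectory.appendingPathComponent("cache.realm")

// Opening a Realm with this configuration creates the file if it is missing.
let cacheRealm = try! Realm(configuration: cacheConfiguration)
print(cacheRealm.configuration.fileURL?.path ?? "no file URL")
```

Every call to `Realm(configuration:)` with the same configuration refers to the same file, so the configuration value is the thing to pass around.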

Also, you don’t need to save your own reference to Realm. If you keep calling that one on whichever thread you are calling it on, it internally recycles pre-created objects. It’s very efficient and makes you write less code in order to reuse the same objects.

The Object represents a single entry in your database. It supports all the major data types: strings, most types of Ints, Int32, 64, Doubles, Booleans, data. On the iOS side, it supports NSDates as well. Like Realm, it’s strictly thread-confined; you cannot move these things across threads.

Realm strictly enforces this write transaction model. Once you’ve added an object to a realm, you need to make any changes to it inside a write transaction. This ensures that there are no possible conflicts. Only one write transaction can be open at one time. Any others in the background will be paused and made to wait until the first one is complete. We can thus guarantee that writes will be atomic and everything will be sent to disk properly.
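The rule can be sketched as follows, assuming RealmSwift is linked. The model is declared with `@objc dynamic`, which Swift 4 and later require; the in-memory identifier `write-demo` is an arbitrary name chosen so the sketch leaves no file behind.

```swift
import RealmSwift

class Dog: Object {
    @objc dynamic var name = ""
    @objc dynamic var age = 0
}

let realm = try! Realm(configuration: Realm.Configuration(inMemoryIdentifier: "write-demo"))

// Before it is added, the object is unmanaged and can be changed freely.
let rex = Dog()
rex.name = "Rex"
rex.age = 4

try! realm.write {
    realm.add(rex)
}

// From here on, rex is managed by the Realm. Assigning rex.age outside
// of a write transaction would raise an exception, so wrap the change:
try! realm.write {
    rex.age = 5
}

print(rex.age)
```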

Getting the data back out again is also a small amount of code.


import RealmSwift

// Create the reference to the Realm on disk
let realm = try! Realm()

// Fetch all the dog objects in Realm
let allDogs = realm.objects(Dog.self)

// Get just the first dog
let rex = allDogs[0]

// Access the dog's properties
print(rex.name)

You get the same reference to the Realm object, and then you grab all the objects of that particular type, and then you grab the one you want.

The great thing about Realm is, because it uses that zero-copy paging mechanism, you’re not grabbing every Dog object on disk and bringing it in. You’re making a weak reference to it and then pulling in the one you want. And this can be refined further.

You can chain that all-objects method with filter and sort, and filter uses the NSPredicate string format. You can do nice queries to pull out all of the data you want in one transaction. When you do one of those objects calls, what you get is called a Realm Results object. They behave exactly like an Array. You can iterate them and pull objects out like an Array. And when you pull an object out, that’s when it gets loaded.
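Here is a short sketch of chaining a filter and a sort, assuming RealmSwift is linked and a recent release where the sort method is spelled `sorted(byKeyPath:)`. The dog names and the in-memory identifier are made up.

```swift
import RealmSwift

class Dog: Object {
    @objc dynamic var name = ""
    @objc dynamic var age = 0
}

let realm = try! Realm(configuration: Realm.Configuration(inMemoryIdentifier: "query-demo"))

// Seed a few dogs in a single write transaction.
try! realm.write {
    for (name, age) in [("Rex", 4), ("Biscuit", 1), ("Ace", 1)] {
        let dog = Dog()
        dog.name = name
        dog.age = age
        realm.add(dog)
    }
}

// The filter takes an NSPredicate format string; nothing is loaded yet.
let puppies = realm.objects(Dog.self)
    .filter("age < 2")
    .sorted(byKeyPath: "name")

// Objects are only materialized as they are accessed.
for puppy in puppies {
    print(puppy.name)
}
```

Keeping the filtering inside the query, instead of copying the Results into an Array and filtering there, is what lets Realm avoid loading objects that are never touched.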

You can have thousands of objects in the Results object, but if only the first one gets touched, Realm is very efficient in the way it uses memory. Again, it’s strictly thread-confined. Any object that pulls data from a realm will have that realm as a parent property in it. Every realm only works in one thread at a time, so any child objects are constrained to the same thread as that object. And it’s a zero-copy architecture. It tries its best to point things to its properties on disk. It’s very straightforward and very easy, and it uses memory mapping.

Back in the old days, sometimes we were constrained by the read speeds of SSD or the NAND flash. These days, especially with the later iOS devices, we’ve found that parts of our own code have been bottlenecks because NAND flash has become fast. You’re not getting any performance hits with this memory mapping technique. As a result, you don’t have to worry about pagination. Unlike SQLite or Core Data, you can grab it all and then work on it as you want, and it’s nice, clean, minimal code.

The coolest thing about Realm is live updates. This makes it less code if you query for an object, and you have a Results object, and then make a change subsequently.

For example, if you query for dogs younger than two and there are none in the database, you still get a Results object–it will not be nil, but its count will be zero. If you then add a matching dog and check the exact same Results object, without refetching or refreshing anything, it will have automagically updated. Hence you do not have to worry about manually polling for changes in the background. When something happens, you will be notified of it straight away.
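The live-update behavior can be sketched like this, assuming RealmSwift is linked. The in-memory identifier and the dog's name are arbitrary.

```swift
import RealmSwift

class Dog: Object {
    @objc dynamic var name = ""
    @objc dynamic var age = 0
}

let realm = try! Realm(configuration: Realm.Configuration(inMemoryIdentifier: "live-demo"))

// Query once. There are no matching dogs yet, so the Results is empty, not nil.
let puppies = realm.objects(Dog.self).filter("age < 2")
let countBefore = puppies.count

// Add a matching dog on the same thread.
try! realm.write {
    let puppy = Dog()
    puppy.name = "Biscuit"
    puppy.age = 1
    realm.add(puppy)
}

// The same Results object now reflects the new dog, with no re-query.
let countAfter = puppies.count
print(countBefore, countAfter)
```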

If you want an explicit notification call–for example, if you’ve bound this to a table view and you need to update the table view–we provide different ways to register these notifications. There’s KVO, which is also good for functional reactive programming; a general one that gets called every time a change happens to a realm; and a more fine-grained one which we can bind to a table view, so as the data of a table view is changing, you can animate the table view appropriately.
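A sketch of the fine-grained variant driving a table view, assuming RealmSwift and UIKit are available and a Realm release where the method is named `observe` (at the time of the talk it was `addNotificationBlock`). `DogListViewController`, the `DogCell` identifier, and the in-memory identifier are hypothetical names.

```swift
import UIKit
import RealmSwift

class Dog: Object {
    @objc dynamic var name = ""
    @objc dynamic var age = 0
}

final class DogListViewController: UITableViewController {
    // Live query; an in-memory Realm keeps this sketch free of side effects.
    private let dogs = try! Realm(configuration: Realm.Configuration(inMemoryIdentifier: "dog-list-demo"))
        .objects(Dog.self)
        .sorted(byKeyPath: "name")

    // The token must be kept alive for as long as updates are wanted.
    private var token: NotificationToken?

    override func viewDidLoad() {
        super.viewDidLoad()
        tableView.register(UITableViewCell.self, forCellReuseIdentifier: "DogCell")

        token = dogs.observe { [weak self] changes in
            guard let tableView = self?.tableView else { return }
            switch changes {
            case .initial:
                // The first callback delivers the full result set.
                tableView.reloadData()
            case .update(_, let deletions, let insertions, let modifications):
                // Later callbacks carry the changed indexes, which map to rows.
                tableView.beginUpdates()
                tableView.deleteRows(at: deletions.map { IndexPath(row: $0, section: 0) }, with: .automatic)
                tableView.insertRows(at: insertions.map { IndexPath(row: $0, section: 0) }, with: .automatic)
                tableView.reloadRows(at: modifications.map { IndexPath(row: $0, section: 0) }, with: .automatic)
                tableView.endUpdates()
            case .error(let error):
                print("Realm notification error: \(error)")
            }
        }
    }

    deinit {
        token?.invalidate()
    }

    override func tableView(_ tableView: UITableView, numberOfRowsInSection section: Int) -> Int {
        return dogs.count
    }

    override func tableView(_ tableView: UITableView, cellForRowAt indexPath: IndexPath) -> UITableViewCell {
        let cell = tableView.dequeueReusableCell(withIdentifier: "DogCell", for: indexPath)
        cell.textLabel?.text = dogs[indexPath.row].name
        return cell
    }
}
```

The controller never re-runs its query; it only translates the change indexes it is handed into row animations.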

Best Practices

Realm is not like SQLite or Core Data. If you’re using Realm, take advantage of live objects. Don’t implement any refreshing logic or requerying. You don’t have to worry about any re-fetch code. And pagination is not necessary.

Some people try to do their own custom filters–they’ll copy results to an NSArray or an Array. That’s not a good idea: you’re pulling everything into memory, because you’re accessing it all and then hanging onto it with a strong reference. We recommend creating the filter you want with a Realm Results object; you’ll get much better performance. You can use your own NSNotifications, but we recommend you take advantage of Realm’s notification system to be properly alerted when a realm has changed and you want to update your UI.

Write transactions are a simple closure, realm.write, but we recommend you minimize them as much as possible. For example:


for item in results {
   try! realm.write {
      item.value = newValue
   }
}

In this code, you’re going through a full loop and modifying your value in each iteration of the for loop. But because the Realm write transaction is inside the for loop, you’re opening and closing a write transaction every single time the loop iterates. If you have 1,000 objects, that’s 1,000 write transactions. It’s more efficient to batch them together.

In this case, it’s easier if you put the loop inside the write transaction.


try! realm.write {
   for item in results {
      item.value = newValue
   }
}

Now there’s only one write transaction, so the write to disk will be more efficient. If you code this wrong, sometimes the file size of the realm file expands, because there are so many write transactions hitting the disk.

It is only possible to have one write transaction open at once. If you have a big background operation and you try to write on the main thread while the background operation is happening, that will block the one on the main thread and your UI will freeze. It’s not always possible, but if you can, try to do as many writes as you can on the background thread. Also try to keep them as small as possible so you’re not having these giant blocking operations. You just have to recreate the same Realm instance on the background thread, and you’ll get the same setup as on the main thread.
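A sketch of moving a large write off the main thread, assuming RealmSwift is linked. The in-memory identifier, the 1,000-dog workload, and the use of a DispatchGroup to signal completion are choices made for this illustration, not something prescribed by Realm.

```swift
import Foundation
import RealmSwift

class Dog: Object {
    @objc dynamic var name = ""
    @objc dynamic var age = 0
}

let configuration = Realm.Configuration(inMemoryIdentifier: "background-demo")

// Realm instance for the main thread.
let mainRealm = try! Realm(configuration: configuration)

let group = DispatchGroup()
DispatchQueue.global(qos: .utility).async(group: group) {
    autoreleasepool {
        // Recreate the Realm on this thread; instances cannot cross threads.
        let backgroundRealm = try! Realm(configuration: configuration)

        // One transaction for the whole batch, not one per object.
        try! backgroundRealm.write {
            for index in 0..<1000 {
                let dog = Dog()
                dog.name = "Dog \(index)"
                backgroundRealm.add(dog)
            }
        }
    }
}

// Back on the main queue, the main-thread Realm sees the committed write.
group.notify(queue: .main) {
    mainRealm.refresh()
    print(mainRealm.objects(Dog.self).count)
}
```

The main thread is never asked to write while the batch is in flight, so it does not block waiting for the background transaction.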

Realm objects and any child objects are NOT thread-safe. They’re confined to a single thread to ensure that atomic writes are maintained. There is an internal list where every single thread has its own unique Realm instance. If you want to pass objects between threads–for example, if you create a dog object on the main thread, pass it to the background thread, and then try to access a property–it will trigger an exception straight away.

Instead, we recommend you use Realm primary keys to uniquely identify Realm objects.


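class Dog: Object {
   dynamic var id = UUID().uuidString
   dynamic var name = ""
   
   // Tell Realm which property uniquely identifies each Dog
   override static func primaryKey() -> String? {
      return "id"
   }
}

// On the current thread: grab a dog and keep only its primary key
let dog = try! Realm().objects(Dog.self).first!
let dogID = dog.id

DispatchQueue.global(qos: .default).async {
   // On the background thread: re-fetch the same dog by its primary key
   let backgroundDog = try! Realm().object(ofType: Dog.self, forPrimaryKey: dogID)
   print(backgroundDog?.name ?? "No dog found")
}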

In this case, we have an ID property with a UUID string. There’s another method on Realm, object(ofType:forPrimaryKey:), and you can pass that primary key across the thread and then re-query it on the secondary thread. And because the primary key property is indexed, it’s very fast to look this up. This works quickly and is the best way to offload work to background threads.

Introducing the Realm Platform

The new hotness is the Realm Platform. We live in a world now where people assume that data is ubiquitous. You make a change on your iPhone and you want the same change to appear on your iPad. Most users expect this because Apple’s said iCloud makes everything synchronized. But in reality, it’s still a lot of effort for developers. The Realm Platform aims to make data ubiquity very easy, and to make it seamless and “just work” for users.

How do we detect remote changes?

We have the mobile database component, and now we have the Realm Object Server. That’s a binary that lives on a server. It can be Azure services or Amazon or DigitalOcean–basically anything that can run Linux or macOS.

You can have two completely discrete devices making changes to one database. When you write to the database locally, those changes are transmitted over WebSockets to an equivalent-data Realm file on the server. We are trying to work with a complete data parity across devices and the server.

It has complex conflict resolution. Even when one device loses connectivity, it still keeps persisting changes. Those changes are buffered to the disk and still made, and once the network connection comes back, those changes are then transmitted back up and then the data reconciliation merges those things on the other device without any conflicts.

We are trying to reduce the need for a REST API. A REST API is necessary because the way data is saved on devices is usually different from how it’s saved on the server. You need some way to abstract the data between those two points.

Usually the model is: you pull the data out of your local store, change it to an abstract format (i.e., JSON), transmit it up, reserialize it into whatever format you need, and then save it to your online store. In Realm, if the way the data is being saved is exactly the same on the server as it is on the local device, all you need to do is transmit the deltas of what changed. It doesn’t even have to be JSON; it can be binary data, and that can be compressed efficiently. You get a nice, automatic way of persisting data with less coding.

REST APIs are still needed, but at the same time, when you have a very simple model of transmitting data between a server and a device, if you don’t need that extra data abstraction, this is an efficient way of doing it.

Hello World! (RMP Edition)


import RealmSwift

let user = ... // Authenticate for a valid user object
let realmURL = URL(string: "realm://api.myserver.com:9080/~/user")!

// Configuration is a struct, so it has to be a var to be modified
var configuration = Realm.Configuration()
configuration.syncConfiguration = (user: user, realmURL: realmURL)

// Create a new instance to save
let myDog = Dog()
myDog.name = "Rex"
myDog.age = 4

// Save it to Realm
let realm = try! Realm(configuration: configuration)
try! realm.write {
   realm.add(myDog)
}

There’s not much in the API: it’s maybe two extra lines of code. When you’re working with the Realm configuration object, you authenticate a user. We provide a range of authentication mechanisms (e.g., username/password, Facebook authentication, Twitter authentication, iCloud user identification). You can authenticate a user when you know that this is their device, and then all you have to do is point the configuration at whichever server you have the Realm Object Server running on, along with that user.

From that point onwards, if you do any write transactions, the data is automatically sent up in the background. You do not have to do any manual coding or data transforms. Obviously, the recipient device has to know a change happened. We do it the exact same way as we have been doing it from background thread updates. You can use the same KVO, the same general notification blocks, the same fine grained notifications, to be alerted when a change has occurred from another device.

If you want to try it out, this is all on our website. We have many tutorials and descriptions on how to get started. There’s a free version for developers. The enterprise version gives you a bit more online server access. The developer edition gives you the synchronization capability on device, with the assumption you can do any business logic you want on the device itself. And we are still tailoring the enterprise edition at the moment.

Questions?

Q: How scalable is the Realm object server?

Tim: We have benchmarked it up to at least 10,000 users. We’ve done a variety of internal tests. That’s just one node, and we’re working on ways to introduce a load balancer at the front, so you could then break up the load to different servers.

Q: Can you do sharding and clustering to break it up to different servers?

Tim: We do not have sharding at the moment, but we are working on that.

Q: Many of your problems with SQLite and Core Data seem like they were due to best practices not being formed yet. Have you tried things like MagicalRecord or any sort of abstractions around Core Data, and have you tried Core Data, again with more modern practices?

Tim: Yes. I think version 1.0.1 of my app was raw Core Data on my own, and I had a Twitter meltdown at that point, and a lovely guy called Tony Arnold recommended I try MagicalRecord, so I tried that. That fixed 80% of the problems, but I think the biggest problem with Core Data is that it’s not thread safe and it never lets you know it is not thread safe.

Even with MagicalRecord, it was getting frustrating that I was still having troubles, even though I was following all the tutorials, I read the Apple documentation, and I thought I was doing everything in best practice. At that point I said, “Is it me? Am I a bad programmer, or is this a bad framework? I’m spending so much time trying to make this work, I feel there has to be a better way out there.”

Q: Are there any best practices around using Realm objects for writing a document where you’re updating text as the user is typing it? Like in a notes application where as you’re writing, you want to keep saving the update note?

Tim: When you have writes, are there any best practices around that? Our demo shared-drawing app has a ton of writes. Every time the Apple pencil’s going over the screen, every frame is saving the X and Y coordinates of where the pencil was at the time and writing that to Realm. That was a combination of a ton of writes happening concurrently and the conflict resolution merging those two changes together.

About the content

This talk was delivered live in October 2016 at iOS Conf SG. The video was transcribed by Realm and is published here with the permission of the conference organizers.

Tim Oliver

Tim Oliver hails from Perth, Australia! He has been an iOS developer for 6 years, and recently joined Realm in March 2015. Tim has a cool app called iComics and he loves karaoke! He does, in fact, also sometimes have the problem of too many kangaroos in his backyard.
