Concurrency and Core Data

In my app, the JSON data is downloaded and the database is populated locally. Populating the individual entities works fine; the problem arises when setting up the relationships between entities.

To explain, here is my actual situation.
There are three tables:

  • Lemma
  • LemmaWriting
  • LemmaMeaning

The relationship between these tables is 1-to-N:
that is, one Lemma has many LemmaWriting and many LemmaMeaning objects.

Since NSBatchInsertRequest cannot set relationships (unfortunately), I am using the performBackgroundTask method. In that block I fetch all the Lemma objects with an NSFetchRequest, and in a for loop, for each lemma, I run another fetch request to retrieve the related objects connected by the foreign key.
Here is the code:

container.performBackgroundTask { context in
    // performBackgroundTask already runs this block on the context's
    // private queue, so no nested context.perform is needed.
    do {
        let fetchWrite: NSFetchRequest<WriteLemma> = WriteLemma.fetchRequest()
        let lemmaFetchRequest: NSFetchRequest<Lemma> = Lemma.fetchRequest()
        let lemmaArray = try context.fetch(lemmaFetchRequest)
        for lemmaItem in lemmaArray {
            // One fetch per lemma, filtered by the foreign key.
            fetchWrite.predicate = NSPredicate(format: "idLemma == %d", lemmaItem.id)
            let writeList = try context.fetch(fetchWrite)
            lemmaItem.scritture = NSSet(array: writeList)
        }
        print("I've finished relationships")
        dispatchGroup.leave()
    } catch let error as NSError {
        print("Could not fetch. \(error), \(error.userInfo)")
    }
}

The problem is this: the Lemma table contains about 7300 elements, and this loop stretches the whole population step to almost 2 minutes. But what intrigued me most is that I have another identical situation in the same database, where the only difference is the number of elements: only 2100. The time difference is enormous, because with 2100 elements the whole loop completes in about twenty seconds.

So my problem is: how can I improve this piece of code to get the job done in less time?

I thought about splitting the Lemma array into chunks and then starting a thread for each chunk, but I'm having trouble doing it and I don't know whether the idea is good.

Do you have any suggestions or advice?

I wasn't able to really follow your code, as it seems some details are missing. My thought was that you might be thinking about Core Data wrong, since you are referring to things like tables and foreign keys. Is there a way to set up the entity relationships at the point that you create each lemma?
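A minimal sketch of that idea, assuming entity classes named Lemma and WriteLemma with a to-many relationship `scritture` and an inverse `lemma` (the JSON keys "id", "writings", and "text" are hypothetical):

```swift
import CoreData

// Hypothetical import routine: attach each child to its parent while
// decoding, so no fetch is ever needed later to reconnect objects.
func importLemmas(_ payload: [[String: Any]], into context: NSManagedObjectContext) {
    for lemmaJSON in payload {
        let lemma = Lemma(context: context)
        lemma.id = lemmaJSON["id"] as? Int64 ?? 0
        for writingJSON in lemmaJSON["writings"] as? [[String: Any]] ?? [] {
            let writing = WriteLemma(context: context)
            writing.text = writingJSON["text"] as? String
            writing.lemma = lemma  // the inverse keeps `scritture` in sync
        }
    }
}
```

With this shape there is no relationship-fixup pass at all; Core Data maintains both sides of the relationship as the objects are created.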

What's taking all the time is the fetching (text searching) you have to do to find the elements in order to re-establish the relationships. I had this same problem ten years ago. I was hoping Apple would create a way to save a Core Data object as a flat file with relationships intact for export, sharing, archiving, and importation. I ended up doing what Apple advised not to do: save a copy of the Core Data object itself as a data file, then check and correct it when unpackaging and re-initializing it. It worked, but I knew I was taking a risk that Apple might change something that would break re-importation of a legacy archive. So, I maintained a Core Data version translation algorithm to check the version and make any adjustments so that the object complied with the current Core Data configuration specifications. I instructed my users to always use the latest version of the app to ensure importation of archived databases would work.

The expense is almost certainly the fetch you do inside the loop. Every fetch is expensive. Try reducing the number of fetches by gathering a batch of objects, extracting the ids you need into a set, and then doing a single fetch with a predicate of the form "id IN %@", passing the set you need.
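A sketch of that batched approach, using the entity and attribute names from the question (the `batchSize` value and the helper's name are assumptions):

```swift
import CoreData

// One IN-predicate fetch per chunk instead of one fetch per lemma.
func attachWritings(to lemmas: [Lemma],
                    in context: NSManagedObjectContext,
                    batchSize: Int = 500) throws {
    var start = 0
    while start < lemmas.count {
        let chunk = Array(lemmas[start..<min(start + batchSize, lemmas.count)])
        let ids = chunk.map { $0.id }
        let request: NSFetchRequest<WriteLemma> = WriteLemma.fetchRequest()
        request.predicate = NSPredicate(format: "idLemma IN %@", ids)
        let writings = try context.fetch(request)
        // Group the fetched children by foreign key for O(1) lookup.
        let byLemmaID = Dictionary(grouping: writings, by: { $0.idLemma })
        for lemma in chunk {
            lemma.scritture = NSSet(array: byLemmaID[lemma.id] ?? [])
        }
        start += batchSize
    }
}
```

This turns roughly 7300 fetches into about 15, which is usually where the bulk of the time goes.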

As a first attempt you could even forget the batching, just fetch all the objects you need in one fetch, organize into dictionaries, and then set the relationships with no fetching at all.
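The no-batching variant can be even simpler: one fetch for all the child objects, grouped into a dictionary keyed by foreign key (again using the names from the question):

```swift
import CoreData

// Simplest variant: a single fetch, then pure in-memory lookups.
func attachAllWritings(to lemmas: [Lemma],
                       in context: NSManagedObjectContext) throws {
    let request: NSFetchRequest<WriteLemma> = WriteLemma.fetchRequest()
    let allWritings = try context.fetch(request)
    let byLemmaID = Dictionary(grouping: allWritings, by: { $0.idLemma })
    for lemma in lemmas {
        lemma.scritture = NSSet(array: byLemmaID[lemma.id] ?? [])
    }
}
```

Whether this fits depends on memory: all 7300 lemmas plus their children must be realized at once, which is why the batched version above may be preferable for larger data sets.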

One more thing to check: does the id in your predicate have an index in the model? Fetching without an index on the attribute is very slow.

As a guide, I would think you could get this code down to less than a second, a couple at most. 20s is already way, way too long.

Thank you very much for your answers and suggestions. However, I believe I have found the answer to the problem. As @drewmccormack said, the fetch inside the loop is what weighs the most (and this was already known). What I did was apply what I learned from raywenderlich.com's "Concurrency by Tutorials" book and use NSAsynchronousFetchRequest instead of its synchronous counterpart, NSFetchRequest. I used it for both LemmaScritture and LemmaSignifica, and the timings dropped to 10-15 seconds. Fantastic! While I develop the application I will keep checking the consistency of the data; if everything stays correct, then that is the best solution!
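For reference, a hedged sketch of the NSAsynchronousFetchRequest pattern described above (how the poster wired it into their import is not shown, so this only illustrates the API shape):

```swift
import CoreData

// Wrap an ordinary fetch request; the completion block is called
// with the results instead of the fetch blocking the caller.
let request: NSFetchRequest<WriteLemma> = WriteLemma.fetchRequest()
let asyncRequest = NSAsynchronousFetchRequest(fetchRequest: request) { result in
    guard let writings = result.finalResult else { return }
    // Group and assign the relationships here, as in the synchronous version.
    print("Fetched \(writings.count) writings")
}
do {
    try context.execute(asyncRequest)
} catch {
    print("Async fetch failed: \(error)")
}
```

Note that an asynchronous fetch overlaps the waiting, but each fetch still costs the same; combining it with the IN-predicate batching above is what removes the per-lemma fetch cost entirely.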