3 ways to make your API slow

I had to fix performance issues in one API endpoint: a fairly typical Symfony endpoint that gathers some data from the database, assembles it into a structure, and returns it as JSON.

Performance became an issue when the bulk of that "some data" grew to 60000 entities. In the worst case, response time was almost 20 seconds. "Ok", I thought, "60k is a big enough number to make it slow". But the trace showed that retrieving data from the DB wasn't the slowest part. There were things taking almost 1/3 of the request time each, and those things were easy to fix.

N+1 requests are slow

We need to load and use related entities, so we have code like this:

$qb = $this->createQueryBuilder('a');

$qb
    ->select('a')
    ->innerJoin('a.entityB', 'b')
    ->where('a.foo = :foo')
    ->setParameter('foo', $foo);

$entities = $qb->getQuery()->getResult();

foreach ($entities as $entity) {
    $cId = $entity->getEntityB()->getEntityC()->getId();
    //...
}

It looks like we're joining the second table to eager-load the related entity, but we still make N+1 requests to the database. That's because Doctrine uses such a join only for filtering; it doesn't create instances of the related entity until we explicitly add its alias to "select".

$qb = $this->createQueryBuilder('a');

$qb
    ->select('a, b')
    ->innerJoin('a.entityB', 'b')
    ->where('a.foo = :foo')
    ->setParameter('foo', $foo);

$entities = $qb->getQuery()->getResult();

//...

Now it works fine. Three characters cut one third of the processing time.

Notice we're using EntityC in the code without joining its table. That's because what we actually get is a Doctrine proxy entity. It stores the EntityC id (from the EntityB record) and makes a DB query only when you try to read any other field.
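To make the proxy behaviour concrete, here is a minimal sketch in plain PHP. It is a simplified model, not Doctrine's real proxy implementation: the class LazyEntityC, the getName() field, and the $queryCount counter are all hypothetical names invented for illustration. It shows why reading the id costs nothing, while touching any other field on N proxies costs N queries.

```php
<?php
// Simplified model of a lazy-loading proxy (NOT Doctrine's actual code).
// LazyEntityC, getName() and $queryCount are hypothetical.
final class LazyEntityC
{
    private $id;
    private $loader;
    private $row = null; // filled on first access to a non-id field

    public function __construct(int $id, callable $loader)
    {
        $this->id = $id;
        $this->loader = $loader;
    }

    public function getId(): int
    {
        // The id is already known from the EntityB row: no query needed.
        return $this->id;
    }

    public function getName(): string
    {
        // Any other field forces a load, once per proxy.
        if ($this->row === null) {
            $this->row = ($this->loader)($this->id);
        }
        return $this->row['name'];
    }
}

$queryCount = 0;

// Stands in for "SELECT ... FROM entity_c WHERE id = ?".
$loader = function (int $id) use (&$queryCount): array {
    $queryCount++;
    return ['id' => $id, 'name' => 'c' . $id];
};

$proxies = [];
for ($i = 1; $i <= 1000; $i++) {
    $proxies[] = new LazyEntityC($i, $loader);
}

foreach ($proxies as $proxy) {
    $proxy->getId();
}
$queriesAfterIds = $queryCount;

foreach ($proxies as $proxy) {
    $proxy->getName();
}
$queriesAfterNames = $queryCount;

printf("queries after ids: %d, after names: %d\n", $queriesAfterIds, $queriesAfterNames);
```

If the loop needed more than the id, adding EntityC to the join and to "select" would be the same fix as for EntityB.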

Creating DateTime from string is slow

Now check this query:

$qb = $this->em->createQueryBuilder();

$qb
    ->select('f')
    ->from(Foo::class, 'f');

return $qb->getQuery()->getArrayResult();

Looks good. Hydrating 60000 objects is slow, so we're retrieving everything as arrays. Now it should be fast. Should be.

There's one nuance. Doctrine still converts every field you retrieve to its PHP type. That's cheap for all those strings and integers, but there are dates. To make every DateTime instance, PHP has to parse a string and convert it to its internal representation of date and time. That's not as slow as I'm making it sound. But doing it 120000 times is slow. Doing it 120000 times and not using those objects is slow and sad.
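To get a feel for that cost, here is a rough micro-benchmark in plain PHP. It only approximates what happens during hydration: the date string is a made-up value, and the loop builds DateTime objects directly instead of going through Doctrine's type conversion. The numbers will vary by machine, so treat it as a sketch rather than a measurement of the real endpoint.

```php
<?php
// Rough micro-benchmark: building two DateTime objects per row for 60000 rows.
// The value below is a hypothetical "created_at"/"updated_at" column value.
$rows = 60000;
$value = '2018-03-14 09:26:53';
$utc = new DateTimeZone('UTC');

$created = 0;
$start = microtime(true);
for ($i = 0; $i < $rows; $i++) {
    $createdAt = new DateTime($value, $utc); // created_at
    $updatedAt = new DateTime($value, $utc); // updated_at
    $created += 2;
}
$withDates = microtime(true) - $start;

// Baseline: the same loop keeping the raw strings untouched.
$kept = 0;
$start = microtime(true);
for ($i = 0; $i < $rows; $i++) {
    $createdAt = $value;
    $updatedAt = $value;
    $kept += 2;
}
$withoutDates = microtime(true) - $start;

printf("%d DateTime objects: %.4f s\n", $created, $withDates);
printf("%d plain strings:    %.4f s\n", $kept, $withoutDates);
```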

So what's the deal? The "created_at" and "updated_at" fields. They are almost never used in code.

We're retrieving all fields of the entity because we use almost all of them. But we should still explicitly list the fields we're going to use.

$qb = $this->em->createQueryBuilder();

$qb
    ->select('f.bar, f.baz, f.spam, f.ham')
    ->from(Foo::class, 'f');

return $qb->getQuery()->getArrayResult();

Look at this query. We've removed half of the processing time by updating one line. Easy!

Jms/Serializer is slow

The third one is hard to notice. We're using FOSRestController and just returning our data, letting it build the response by itself.

Usually an action looks like this:

use FOS\RestBundle\Controller\Annotations\View;

/**
 * @View()
 */
public function fooAction()
{
    //...
    return $result;
}

Under the hood FOSRestController uses Jms/Serializer - a pretty smart library for serializing stuff.

The problem is that Jms/Serializer is smart even when it shouldn't be. If you give it a huge multidimensional array, usually you just want it converted to JSON as is. But Jms/Serializer recursively walks through your array, determining how to serialize every item one by one. With a huge array that takes time. And that work is often useless, because all you need is json_encode.
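The difference can be modelled in plain PHP. The walkAndEncode() function below is a hypothetical stand-in for a serializer that inspects each item before emitting it; the real Jms/Serializer does considerably more work per node (metadata, visitors, events), so this sketch likely understates the gap. The sample data shape is also invented. Both paths produce the same JSON here, and only the amount of per-item work differs.

```php
<?php
// Hypothetical stand-in for a type-aware serializer (NOT Jms/Serializer code):
// it decides how to encode every item one by one.
function walkAndEncode($value): string
{
    if (is_array($value)) {
        $isList = $value === [] || array_keys($value) === range(0, count($value) - 1);
        $parts = [];
        foreach ($value as $key => $item) {
            $encoded = walkAndEncode($item);
            $parts[] = $isList ? $encoded : json_encode((string) $key) . ':' . $encoded;
        }
        return $isList
            ? '[' . implode(',', $parts) . ']'
            : '{' . implode(',', $parts) . '}';
    }

    return json_encode($value); // scalar or null
}

// Invented sample data: 60000 small rows.
$data = [];
for ($i = 0; $i < 60000; $i++) {
    $data[] = ['id' => $i, 'name' => 'item ' . $i, 'tags' => ['a', 'b']];
}

$start = microtime(true);
$manual = walkAndEncode($data);
$manualTime = microtime(true) - $start;

$start = microtime(true);
$native = json_encode($data);
$nativeTime = microtime(true) - $start;

printf("per-item walk: %.4f s\n", $manualTime);
printf("json_encode:   %.4f s\n", $nativeTime);
```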

Let's update our controller a bit.

use Symfony\Component\HttpFoundation\JsonResponse;

public function fooAction()
{
    //...
    return new JsonResponse($result);
}

Now our endpoint takes much less than a second to work.

Notice that we didn't change any logic or make any complex optimizations. All we did was remove unexpected time wasters. Now look into the trace of your endpoint, and write in the comments if you find something interesting there.
