Tag vite :

Velibs, Pomm and Silex

 Posté par greg le 11 juillet 2011 14:41 |  Les commentaires sont désactivés pour ce post.
Tags : geek  computing  postgresql  dev  php  silex  pomm 

Pomm + Silex + Velibs

I want to ride my bicycle

Paris has changed since the apparition of Velib’s stations. Wherever you are, there is a velib station near you where you can take a bike, ride to your destination and find another station to let your bike. But as soon as you are a velib user, there will be questions that will come over and over again: «will there be any bike left at the nearest station?» and if not «what are the nearest stations with bikes available?». Once you arrive at destination, you obviously find the station to be full: «where is the nearest station with a room for my bike?».
The good news is Velib’s data have been open to public access via a web service, so it is now possible to create a small application that will free your mind of all that questions. That what Marc Weistroff made after he created nice APIs to access the web service.

Pomm and the database side

Data are divided in two entities:

  • velib_station position, name etc as VelibStation entities
  • velib_station_data availability, time etc as VelibStationData entities

Our database is fed by a webservice, the first thing we thought about was to create the same schema in the database. But, the ws schema gives each station’s coordinates as two columns “lat” and “long”. Using postgresql’s type «point» would allow us to query stations on distance between each other, which is a great feature. So we had to register a new converter to the database and overload the VelibStation convert() method to hydrate our instances correctly:

<?php # Model/Pomm/Database/VlibDb.php

namespace ModelPommDatabase;

use PommConnectionDatabase;
use PommConverterPgEntity;
use PommConverterPgPoint;

class VlibDb extends Database
{
    protected function initialize()
    {
        parent::initialize();
        $this->registerConverter('Point', new PgPoint(), array('point'));
    }
}
<?php # Model/Pomm/Entity/Vlib/VelibStation.php

namespace ModelPommEntityVlib;

use PommObjectBaseObject;
use PommExceptionException;
use PommTypePoint;

class VelibStation extends BaseObject
{
    public function convert(Array $data)
    {
        $x = 0;
        $y = 0;
        $tmp = array();

        foreach($data as $key => $value)
        {
            if ($key == 'lat')
            {
                $x = $value;
                continue;
            }
            if ($key == 'lng')
            {
                $y = $value;
                continue;
            }

            $tmp[strtolower($key)] = $value;
        }

        $tmp['coord'] = new Point($x, $y);

        $this->hydrate($tmp);
    }
}

With that setting, we can now query on a particular station and ask postgres to return the 5 closest stations around !

<?php # Model/Pomm/Entity/Vlib/VelibStation.php

public function findNearest($id, $limit = 5)
{
    $sql = <<<EOSQL
SELECT 
    %s, 
    (coord(a) <-> coord(b)) AS distance 
FROM 
    vlib.velib_station a, 
    vlib.velib_station b 
WHERE 
        a.id = ? 
    AND 
        a.id != b.id 
ORDER BY 
    distance ASC 
LIMIT %d
EOSQL;

    $sql = sprintf($sql, join(', ', $this->getSelectFields('b')), $limit);

    return $this->query($sql, array($id));
}

Let’s pay attention on how we wrote that query. The velib_station “a” is the velib station we do query, velib stations b are the nearest we want to find, this is why we do specify the “b” alias in the getSelectFields method call. We do specify we want “a” and “b” to be different and we use the “<->“ distance operator for the Point data type to sort our results by distance. This assumes earth can be considered as a plan on such small distances.

station’s data

Of course, the most important thing are not stations but their data: how many available slots are there ? These data are retrieved every once in a while (let’s say 5 minutes) by the server. Of course, to be able to display a nice graph of slots availability we need all rows within the last 24 hours. But the most important piece of information here is the last data we had. There are several ways to retreive that row.

The ugly limit

The first thing most developers think when they are in this situation is:

SELECT
    s.*,
    d.*
FROM
    velib_station s 
        RIGHT JOIN velib_station_data d ON d.station_id = s.id
WHERE
    s.id = ?
ORDER BY
    d.created_at
LIMIT 1

This is ugly because LIMIT should be used for formatting purposes only.

The great window function call

Using a window function can help us sort data by rank over their creation time:

SELECT
    s.*,
    d.*
FROM
    velib_station s,
    (SELECT data.*, rank() OVER (ORDER BY data.created_at DESC) AS rank FROM velib_station_data data WHERE data.station_id = ?) d
WHERE
        s.id = ?
    AND
        d.rank = 1

The awesome entity type

There is a solution that present better read performances for a bit bigger writing cost. Here is what’s written in the Postgresql documentation: «Whenever you create a table, a composite type is also automatically created, with the same name as the table, to represent the table’s row type.». This means we can alter the “velib_station” table to add a column named data with the type “velib_station_data” and manage to update this column every time we add a record in “velib_station_data”.

This turns our problem to a simple query that retrieve all “velib_station” fields and even better: as soon as you have a “velib_station”, you also own the last data associated with it. This means we have these data for the nearest stations retrieved above.

How to implement that with Pomm ?

Pomm lets you use an EntityConverter to play with entities in your entities. It must be registered to the database:

<?php # Model/Pomm/Database/VelibDb.php
class VlibDb extends Database
{
    protected function initialize()
    {
        parent::initialize();

        $this->registerConverter('Point', new PgPoint(), array('point'));
        $this->registerConverter('VelibStationData', new PgEntity($this, 'ModelPommEntityVlibVelibStationData'), array('velib_station_data'));
    }
}

This will convert a “velib_station_data” type into a “VelibStationData” instance. Let’s bend over the update process, saying the web service’s values are stored under a “$values” variable:

<?php # application.php
[...]
    // Update database
    $data_map = $app['db']->getMapFor('ModelPommEntityVlibVelibStationData');
    $station_map = $app['db']->getMapFor('ModelPommEntityVlibVelibStation');

    try
    {
        $app['db']->begin();
        $velib_station_data = $data_map->createObject();
        $values['station_id'] = $id;
        $velib_station_data->hydrate($values);
        $data_map->saveOne($velib_station_data);
        $velib_station = $station_map->findByPk(array('id' => $id));
        $velib_station->setData($velib_station_data);
        $station_map->updateOne($velib_station, array('data'));
        $app['db']->commit();
    }
    catch (Exception $e)
    {
        $app['db']->rollback();

        throw $e;
    }

The first step is to create a new VelibStationData instance, hydrate it from the web services and save it to the database. Once done, we can update the corresponding VelibStation instance by setting directly data with the VelibStationData instance thanks to the converter. Then we can call the updateOne method just to update that column for this record. Everything is within a transaction to ensure the database remains as is if something goes wrong.

As soon as a station is retrieved, also are the latest informations about it. This means it is now really easy to color the nearest stations depending on the remaining slots they have.

Liens récents 

 Crew - code review for Git projects.

Marqué par greg le 6 mai 2012 07:57..
Tags: geek computing git

Suivre le lien

 responsive grid system

Marqué par greg le 5 mai 2012 12:38..
Tags: geek computing dev

Suivre le lien

 Vim "improved

Marqué par greg le 8 mars 2012 09:26..
Tags: geek computing vim

Suivre le lien

 Highly scalable NoSql blog

Marqué par greg le 3 mars 2012 09:12..
Tags: geek computing dev

Suivre le lien

 Bruce Momjian Pg blog

Marqué par greg le 16 février 2012 13:51..
Tags: geek computing postgresql net blog

Suivre le lien

 Richard Miller's blog

Marqué par greg le 16 février 2012 09:57..
Tags: geek computing dev php symfony net blog

Suivre le lien

 Online SQL explain interface

Marqué par greg le 15 février 2012 09:58..
Tags: geek computing postgresql net

Suivre le lien

 Ace online code editor

Marqué par greg le 15 février 2012 08:57..
Tags: net geek computing dev javascript

Suivre le lien

 Coding horror

Marqué par greg le 14 février 2012 12:42..
Tags: geek computing dev net blog

Suivre le lien

 telegraphCQ - Stream oriented database

Marqué par greg le 13 février 2012 14:44..
Tags: geek computing postgresql

Suivre le lien