Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[12.x] Query builder PDO fetch modes #54443

Merged
merged 24 commits into from
Feb 17, 2025
Merged

Conversation

bert-w
Copy link
Contributor

@bert-w bert-w commented Feb 2, 2025

Hi, this is not the first time I've requested this idea but I would like you to reconsider the following addition for Laravel 12.x since I think it opens up many many performance improvements. This PR has simplified the premise a little bit by only relying on a single new $query->fetchUsing()->... function.

PDO fetch modes allow developers to control the way how database statements are retrieved/used in PHP. Currently in Laravel, the default mode is PDO::FETCH_OBJ which returns a php object for every row in the result set. By making this available as a query modifier, certain optimizations can be achieved because the data format is defined through PDO.

There are various fetch modes that can offer greater efficiency of the code, e.g. retrieving a dictionary of 2 columns with PDO::FETCH_KEY_PAIR (which is exactly the query builder pluck() function) or keying a resultset instantly by some other column (PDO::FETCH_UNIQUE). By using specific PDO modes, a lot of processing that was traditionally done using Laravel's Collection methods can now be done using a simple query modifier.

TLDR

This PR adds the following:

  • Added Query Builder fetchUsing(...$args) function for a new way to modify the format of the query result.
  • Simplified Query Builder pluck() to use this new PDO mode to remove unneeded code and speed up the execution.
  • Changed several Connection method signatures to accept an optional array $fetchUsing = [] argument (Breaking Change -> Laravel 12).
  • Removed protected Query Builder function onceWithColumns because it behaves weirdly (it only actually uses your arguments if no columns have been assigned). 2 internal references have been simplified to not use this function anymore.

Usage examples

Key-value pairs (pluck)

By selecting only two columns and using PDO::FETCH_KEY_PAIR, the result is automatically returned as a dictionary, requiring no further processing:

User::query()->select(['id', 'username'])->fetchUsing(\PDO::FETCH_KEY_PAIR)->get()->toArray();
array:3 [
  1 => "ape"
  2 => "bear"
  3 => "snake"
]

Or as a list of values:

User::query()->toBase()->select(['username'])->fetchUsing(\PDO::FETCH_COLUMN)->get()->toArray();
array:3 [
  0 => "ape"
  1 => "bear"
  2 => "snake"
]

Both of the syntaxes above have been worked into the pluck() method, so you can just call User::query()->pluck() and benefit from both the key-value and list-value optimizations.

Key a collection by a given column (keyBy)

When using PDO::FETCH_UNIQUE, the first column in the select-statement is consumed as the key for the output array, so it functions similar to $collection->keyBy('id') but instead it is returned directly from PDO (so no more $collection->keyBy() required):

User::query()->select(['id', 'users.*'])->fetchUsing(\PDO::FETCH_UNIQUE)->get()->toArray();
array:3 [
  1 => array:4 [
    "id" => 1
    "username" => "ape"
    "created_at" => "2025-02-02T17:46:22.000000Z"
    "updated_at" => "2025-02-02T17:46:22.000000Z"
  ]
  2 => array:4 [
    "id" => 2
    "username" => "bear"
    "created_at" => "2025-02-02T17:46:23.000000Z"
    "updated_at" => "2025-02-02T17:46:23.000000Z"
  ]
  3 => array:4 [
    "id" => 3
    "username" => "snake"
    "created_at" => "2025-02-02T17:46:23.000000Z"
    "updated_at" => "2025-02-02T17:46:23.000000Z"
  ]
]

This will come in handy for various algorithms, since the array keys now have a meaning and can be used to retrieve data.

Group x by y (mapToGroups)

A bit of an odd one, but say you have a column role which contains either 'admin','moderator','user' and you want to retrieve usernames for each of the groups. Traditionally, you'd have to loop over the array in PHP and build these groups yourself using for instance $collection->mapToGroups(...). There is however a much faster way:

User::query()->toBase()->select(['role', 'username'])->fetchUsing(\PDO::FETCH_GROUP | \PDO::FETCH_COLUMN)->get()->toArray();
array:3 [
  "user" => array:3 [
    0 => "Jonas"
    1 => "Jonas"
    2 => "John"
  ]
  "moderator" => array:1 [
    0 => "Jimmy"
  ]
  "admin" => array:1 [
    0 => "Jake"
  ]
]

Cursors

All of the above will also be available for the $query->cursor() function so you can lazily process your data.

Benchmarks

Test file: https://gist.github.com/bert-w/3f716af1741c380a8517ea6a7e757854

If you want to run this yourself, note that you have to re-add the old pluck() function inside src\Illuminate\Database\Query\Builder.php and rename it to pluckOld(), including the functions onceWithColumns, stripTableForPluck, pluckFromObjectColumn, pluckFromArrayColumn (they are not necessary anymore after this PR).

 ┌─────────────┬────────┬───────────┬────────────┬─────────────┐
 │ Test        │ Branch │ # Records │ Time (ms)  │ Improvement │
 ├─────────────┼────────┼───────────┼────────────┼─────────────┤
 │ keyBy       │ branch │ 5         │ 0.1951882  │ 3.6%        │
 │ keyBy       │ master │ 5         │ 0.2024828  │             │
 │ keyBy       │ branch │ 50        │ 0.2679888  │ 4.25%       │
 │ keyBy       │ master │ 50        │ 0.279884   │             │
 │ keyBy       │ branch │ 500       │ 1.2666686  │ 8.94%       │
 │ keyBy       │ master │ 500       │ 1.3910648  │             │
 │ keyBy       │ branch │ 5000      │ 9.3656038  │ 14.9%       │
 │ keyBy       │ master │ 5000      │ 11.0052914 │             │
 │ keyBy       │ branch │ 25000     │ 47.1913486 │ 15.03%      │
 │ keyBy       │ master │ 25000     │ 55.5400322 │             │
 │ mapToGroups │ branch │ 5         │ 0.1496776  │ 4.99%       │
 │ mapToGroups │ master │ 5         │ 0.1575438  │             │
 │ mapToGroups │ branch │ 50        │ 0.195247   │ 10.72%      │
 │ mapToGroups │ master │ 50        │ 0.218697   │             │
 │ mapToGroups │ branch │ 500       │ 0.5607566  │ 22.3%       │
 │ mapToGroups │ master │ 500       │ 0.721686   │             │
 │ mapToGroups │ branch │ 5000      │ 3.4245092  │ 31.18%      │
 │ mapToGroups │ master │ 5000      │ 4.9759744  │             │
 │ mapToGroups │ branch │ 25000     │ 15.2814816 │ 39.06%      │
 │ mapToGroups │ master │ 25000     │ 25.0745594 │             │
 │ pluck       │ branch │ 5         │ 0.1543958  │ 4.79%       │
 │ pluck       │ master │ 5         │ 0.1621634  │             │
 │ pluck       │ branch │ 50        │ 0.1914712  │ 3.45%       │
 │ pluck       │ master │ 50        │ 0.1983118  │             │
 │ pluck       │ branch │ 500       │ 0.531743   │ 9.38%       │
 │ pluck       │ master │ 500       │ 0.586767   │             │
 │ pluck       │ branch │ 5000      │ 3.2316182  │ 16.54%      │
 │ pluck       │ master │ 5000      │ 3.8720818  │             │
 │ pluck       │ branch │ 25000     │ 13.832984  │ 26.2%       │
 │ pluck       │ master │ 25000     │ 18.7447848 │             │
 └─────────────┴────────┴───────────┴────────────┴─────────────┘

Conclusion

The whole reason for this PR is to run code as efficiently as possible. The more we can ask from PDO to build the correct data structure, the less we have to rely on Laravel Collection methods to reprocess our data.

All of the modes mentioned above can help tremendously when building efficient algorithms to process data from your database. Please look carefully at this PR and communicate any issues you might see with this implementation.

Some other resouces to have a look at if you want to know more:

Copy link

github-actions bot commented Feb 2, 2025

Thanks for submitting a PR!

Note that draft PR's are not reviewed. If you would like a review, please mark your pull request as ready for review in the GitHub user interface.

Pull requests that are abandoned in draft may be closed due to inactivity.

@bert-w bert-w marked this pull request as ready for review February 7, 2025 13:00
@shaedrich
Copy link
Contributor

shaedrich commented Feb 7, 2025

Would it make sense to be able to pass this to query() as well?

@bert-w
Copy link
Contributor Author

bert-w commented Feb 7, 2025

Would it make sense to be able to pass this to query() as well?

Yes you can already do that since this method is automatically forwarded to the underlying Query\Builder if the method doesnt exist on Eloquent\Builder, e.g.:

User::query()->select(['id','users.*'])->fetchUsing(PDO::FETCH_UNIQUE)->get();

Would give you a collection where the keys are User ids and the values are User models.

Note that in this case, the ->get() function is the one from Eloquent\Builder which hydrates the database result into Eloquent models, so the fetchUsing() mode must take that into account. When you call Query\Builder::get(), the query result is simply wrapped into a Collection.

@shaedrich
Copy link
Contributor

I actually meant more like this:

User::query(PDO::FETCH_UNIQUE)->select(['id','users.*'])->get();

@flavio-schoute
Copy link
Contributor

I actually meant more like this:

User::query(PDO::FETCH_UNIQUE)->select(['id','users.*'])->get();

I like this approach but it feels a little bit weird 🤔

@shaedrich
Copy link
Contributor

shaedrich commented Feb 12, 2025

I actually meant more like this:

User::query(PDO::FETCH_UNIQUE)->select(['id','users.*'])->get();

I like this approach but it feels a little bit weird 🤔

@flavio-schoute Because we're passing a PDO bitmask constant into a Laravel method? Yeah, that could be wrapped in an enum perhaps 🤔

@taylorotwell
Copy link
Member

@bert-w can you fix the conflicts on this?

@taylorotwell taylorotwell marked this pull request as draft February 14, 2025 16:08
@bert-w
Copy link
Contributor Author

bert-w commented Feb 14, 2025

@taylorotwell conflicts are now resolved. There appear to be some SSL/TLS errors though.

I removed/reverted a test that was added in #54396 since it no longer applies in this setup. It tested the case where if you do $query->pluck('foo', 'foo');, it should generate the SQL SELECT foo FROM x instead of SELECT foo, foo FROM x. However the new setup actually requires both columns (also the PR doesn't optimize anything since repeating a column in SQL is negligible in terms of performance loss/gain).

@bert-w bert-w marked this pull request as ready for review February 14, 2025 16:45
@taylorotwell
Copy link
Member

@bert-w is it possible to revert all changes to pluck in this PR? It feels outside of the scope and would make it easier to merge / less risky to only add the fetch mode stuff.

@bert-w
Copy link
Contributor Author

bert-w commented Feb 14, 2025

@taylorotwell I have reverted most changes to the query builder except the protected function onceWithColumns(...) removal since it resolves a bug and simplifies the code.

Bug description:

User::query()->select(['id'])->pluck('name', 'id');
// Error: Undefined property: stdClass::$name

The query above doesn't work because onceWithColumns doesnt do what it says (it only changes $this->columns when it is null). So the query only contains the id column and not the columns you define as pluck(...) arguments.

The same thing happens at $query->select(['a'])->get(['b']), but here there might be cases where people unknowningly use this behaviour since it doesn't result in an error directly. The refactor for the get() function therefore uses:

// \Illuminate\Database\Query\Builder.php::3099
$this->columns ??= Arr::wrap($columns);

which functions exactly the same as without the onceWithColumns wrapper but is much easier to read.

@bert-w bert-w marked this pull request as ready for review February 14, 2025 20:49
@taylorotwell taylorotwell merged commit ca7ca2c into laravel:master Feb 17, 2025
39 checks passed
@macropay-solutions
Copy link

macropay-solutions commented Feb 18, 2025

Sorry for the late question. Is this supposed to be public? It has a setter and it is used only in the query builder.
(We are adapting our API crud AUTO filtering lib to laravel 12 and because we create temporary tables for the sub-selects when filtering by relation column condition, we must add also this $fetchUsing into the lib, thus we want to make sure this will not change into protected) Thank you.

    /**
     * The custom arguments for the PDOStatement::fetchAll / fetch functions.
     *
     * @var array
     */
    public array $fetchUsing = [];

@bert-w
Copy link
Contributor Author

bert-w commented Feb 18, 2025

@macropay-solutions I followed the rest of the properties in the class (of which most are public). There are no getters for these properties so having them public is the only way to access them in your code (regardless of if you should).

I guess another good question you could ask is: why is there an protected $afterQueryCallbacks = []; while all the other properties are public :)

@macropay-solutions
Copy link

@bert-w

Thank you.

We will overwrite it as public then to keep support for laravel/lumen >=8.

No additional questions from our side.

PS. it is a nice feature when you know what you are doing.

@shaedrich
Copy link
Contributor

I followed the rest of the properties in the class (of which most are public).

While this might make sense stylistically, it might actually point out flaws in the design which you might have added to, but if it would have been a larger issue, Taylor sure would have said so :)

@macropay-solutions
Copy link

macropay-solutions commented Feb 18, 2025

@taylorotwell will this remain public or not?

    /**
     * The custom arguments for the PDOStatement::fetchAll / fetch functions.
     *
     * @var array
     */
    public array $fetchUsing = [];

UPDATE

Taking into account the rollback, this question is obsolete. We will keep the implementation in our lib anyway.

@macropay-solutions
Copy link

@bert-w Taking into account that this MR was reverted, can you please confirm that onceWithColumns function will work with (only) these changes:

    public array $fetchUsing = [];


    public function fetchUsing(...$fetchUsing): static
    {
        $this->fetchUsing = $fetchUsing;

        return $this;
    }

    ...
    $statement->fetchAll(...$fetchUsing);
    ...

@decadence
Copy link
Contributor

@macropay-solutions it's merged in another PR with fixes

@macropay-solutions
Copy link

macropay-solutions commented Mar 3, 2025

@decadence can you pls link that PR? the code
public array $fetchUsing = [];

is not in 12.x branch

public function select($query, $bindings = [], $useReadPdo = true)

return $statement->fetchAll();

@decadence
Copy link
Contributor

@macropay-solutions #54734
You're right. Looks like it exists only in master branch.

@bert-w
Copy link
Contributor Author

bert-w commented Mar 3, 2025

@macropay-solutions this PR was reverted because of a bug, but a day later I made a one-line fix and it was merged in again. This feature was probably too close to the 12.x release date, so that might be the reason that it now remains on master (12.x was branched off just a few commits earlier so this was not included).

I guess it will become part of 13.x in the future since this PR brings some minor breaking changes.

@macropay-solutions
Copy link

macropay-solutions commented Mar 4, 2025

@bert-w
Thanks for the info.
But still, we added your feature to our lib for backward compatibility.
Can you please confirm that JUST passing the options to ->fetchAll(...$fetchUsing) will work in older versions with the function onceWithColumns (that you removed)?

@bert-w
Copy link
Contributor Author

bert-w commented Mar 5, 2025

@macropay-solutions I have no clue what exactly your lib does, however $fetchUsing is an empty array by default, so calling ->fetchAll(...$fetchUsing) is the same as ->fetchAll() in those cases. onceWithColumns was an internal function, so if people depend on that it's likely that they'll have issues on every major release.

@macropay-solutions
Copy link

macropay-solutions commented Mar 5, 2025

@bert-w rephrasing :)

In laravel 8, 9, 10, 11, 12, (disregarding our lib) imagine we hardcode in vendor a fetch mode in that function call

->fetchAll(/*we put a pdo mode here manually*/)

Will it work in the sense that the function onceWithColumns will be able to handle the result?

Because if not, we have to throw an exception when the function fetchUsing is used to prevent devs using it until they use the lavarel version in which it will be published.

@bert-w
Copy link
Contributor Author

bert-w commented Mar 5, 2025

@macropay-solutions it will pass through onceWithColumns without issue. That function simply does this:

  1. Set columns on query (if not set already) and save the current columns as backup
  2. Run query
  3. Revert columns to the backup

Stuff may start to go wrong depending on what you do with your query afterwards. I suggest you just take a look at the code because you could've gotten your answer 2 weeks ago; the changes aren't that complex really :)

Honestly if you don't know whether stuff will work in your application for version x under feature y, you obviously need to write some tests.

@macropay-solutions
Copy link

macropay-solutions commented Mar 5, 2025

@bert-w
Thank you.

Our tests could not cover all situations, that is why we asked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants