Skip to content

ordered query API #1195

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
175 changes: 175 additions & 0 deletions text/0000-ordered-queries.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,175 @@
- Feature Name: ordered-queries
- Start Date: 2015-09-25
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary

Add the following to BTreeMap

* min
* max
* get_le
* get_lt
* get_ge
* get_gt

* min_mut
* max_mut
* get_le_mut
* get_lt_mut
* get_ge_mut
* get_gt_mut


and to BTreeSet:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these names have precedence from other libraries? They seem a bit too succinct to me (although a big plus one to the actual functionality, I've wanted this).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java: higher/lower/ceil/floor
C++: lower_bound/upper_bound (these names are terrible and I explicitly killed them in collections reform)
Everything else I looked at: chaos or doesn't seem to have this precise collection/functionality.

I briefly pondered before/after and next/prev before letting my theory background take over and demand predecessor/successor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another potential naming scheme could involve {lt, le, ge, gt}, optionally with a prefix or suffix if we're concerned about conflicting with PartialOrd's methods.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some ideas:
before, after, before_eq, after_eq
find_{lt, le, ge, gt}
get_{lt, le, ge, gt}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incidentally the lack of genericity over mutability is killing me. Don't how I'd do it, but there's so much repetition in API's these days because of it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh damn right I wanted to avoid that auuuugh.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is {next, next_or_eq, prev, prev_or_eq, first, last}.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do really like that lt/leq/etc is an established naming convention that people can bring into understanding.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leq or le? The former might be easier to grok, but the latter is consistent with PartialOrd and has the minor benefit of having the same number of characters as {lt, gt}.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh whoops, I thought that PartialOrd used leq.

* min
* max
* get_le
* get_lt
* get_ge
* get_gt



# Motivation

Currently the only option people have to do ordered queries on a BTreeMap is to make a full-blown
`range` iterator with carefully selected bounds. However constructing an iterator can require
a significantly larger amount of state construction. In particular our current BTreeMap
implementation suffers greatly due to its expensive iterator design (allocates a VecDeque for the
whole search path). A BTreeMap with parent pointers could potentially avoid any performance hit
modulo, but this is a more general problem for the *ordered map* API. There are surely types for
which a straight-up query will be cheaper than iterator initialization.

It is also significantly more ergonomic/discoverable to have `get_le_mut(&K)` over
`range_mut(Bound::Unbounded, Bound::Inclusive(&K)).next_back()`.




# Detailed design


The BTreeMap APIs are as follows:

get_(le|lt|gt|ge):

```rust
fn get_le<Q: ?Sized>(&self, &Q) -> Option<(&K, &V)>
where K: Borrow<Q>;
```

get_(le|lt|gt|ge)_mut:

```rust
fn get_le_mut<Q: ?Sized>(&mut self, &Q) -> Option<(&K, &mut V)>
where K: Borrow<Q>;
```

min|max:

```rust
fn min(&self) -> Option<(&K, &V)>;
```

(min|max)_mut:

```rust
fn min_mut(&mut self) -> Option<(&K, &mut V)>;
```


BTreeSet gets the equivalent APIs with the value part of the return removed.


Note that in contrast to `get` the key is yielded because the key that matched the query is
unquestionably new information for a querier. Also we don't want to have to add _keyed variants
of all these *later*.




# Drawbacks

Weep before the might of combinatorics, destroyer of API designs.




# Alternatives

## Use Bounds to unify inc/exc/extreme

```rust
fn pred(&self, Bound<&Q>)
fn succ(&self, Bound<&Q>)
fn pred_mut(&self, Bound<&Q>)
fn succ_mut(&self, Bound<&Q>)
```

where `pred(Unbounded)` is max, and `succ(Unbounded)` in min by assuming you're getting the
predecessor and successor of positive and negative infinity. This RFC does not propose this
API because it is in the author's opinion awful and would make our users cry.



## Use a custom enum to capture all variability

Take enums instead of having many methods:

```rust
enum Query<T> {
Min,
Lt(T),
Le(T),
Ge(T),
Gt(T),
Max,
}

impl<K, V> Map<K, V> {
fn query<Q: ?Sized>(&self, query: Query<&Q>) -> Option<(&K, &V)>
where K: Borrow<Q>;

fn query_mut<Q: ?Sized>(&self, query: Query<&Q>) -> Option<(&K, &mut V)>
where K: Borrow<Q>;
}
```

But this is just shuffling around the complexity, and making a more painful calling convention
that involves importing names:

```rust
use std::collections::Query;
let result = map.query(Query::Lt("hello"));
let result = map.query_mut(Query::Max);

// pulled in enum
use std::collections::Query::*;
let result = map.query(Lt("hello"));
let result = map.query_mut(Max);
```

vs

```rust
let result = map.get_lt("hello");
let result = map.max_mut();
```



## Give mutable access to keys

We could also give `&mut` access to the keys in the `_mut` variants. This would enable
changing "unimportant" information in the keys without resorting to interior mutability
mechanisms. It would allow BTreeSet to have _mut variants of all these methods. This RFC
does not propose this because it's probably a really big footgun while also being quite niche.



# Unresolved questions

Nothing. This is pretty straightforward.