Jump to page: 1 2
Thread overview
October 13

Hello there.

I'm wondering if anyone would be opposed to us/me making associative array having resize/reserve public?

Currently there exists private members in druntime etc for this but they are not exposed.

Basically I would like to be able to reserve n buckets before populating.

For reference, see std::unordered_map, which is kvp, which has rehashing etc:

  template<typename _Key, typename _Tp,
	   typename _Hash = hash<_Key>,
	   typename _Pred = equal_to<_Key>,
	   typename _Alloc = allocator<std::pair<const _Key, _Tp>>>
    class unordered_map
    {
      typedef __umap_hashtable<_Key, _Tp, _Hash, _Pred, _Alloc>  _Hashtable;
      _Hashtable _M_h;

https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/unordered_map.h

https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/hashtable.h

October 13

On Friday, 13 October 2023 at 11:32:29 UTC, Imperatorn wrote:

>

Hello there.

I'm wondering if anyone would be opposed to us/me making associative array having resize/reserve public?

Currently there exists private members in druntime etc for this but they are not exposed.

Basically I would like to be able to reserve n buckets before populating.

For reference, see std::unordered_map, which is kvp, which has rehashing etc:

  template<typename _Key, typename _Tp,
	   typename _Hash = hash<_Key>,
	   typename _Pred = equal_to<_Key>,
	   typename _Alloc = allocator<std::pair<const _Key, _Tp>>>
    class unordered_map
    {
      typedef __umap_hashtable<_Key, _Tp, _Hash, _Pred, _Alloc>  _Hashtable;
      _Hashtable _M_h;

https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/unordered_map.h

https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/hashtable.h

It's something I've wanted. There was a discussion this summer: https://forum.dlang.org/post/u8fp23$2462$1@digitalmars.com

October 14

On Friday, 13 October 2023 at 11:32:29 UTC, Imperatorn wrote:

>

Hello there.

I'm wondering if anyone would be opposed to us/me making associative array having resize/reserve public?

Currently there exists private members in druntime etc for this but they are not exposed.

Basically I would like to be able to reserve n buckets before populating.

There was a PR for that but it wasn't merged. It only reserved bucket space, not space for elements. See:
https://github.com/dlang/druntime/pull/1929#issuecomment-336799304

October 14

On Saturday, 14 October 2023 at 20:46:45 UTC, Nick Treleaven wrote:

>

On Friday, 13 October 2023 at 11:32:29 UTC, Imperatorn wrote:

>

Hello there.

I'm wondering if anyone would be opposed to us/me making associative array having resize/reserve public?

Currently there exists private members in druntime etc for this but they are not exposed.

Basically I would like to be able to reserve n buckets before populating.

There was a PR for that but it wasn't merged. It only reserved bucket space, not space for elements. See:
https://github.com/dlang/druntime/pull/1929#issuecomment-336799304

Interesting. I will read more about it. Pros and cons etc. Thanks

October 14
On Friday, October 13, 2023 5:32:29 AM MDT Imperatorn via Digitalmars-d wrote:
> Hello there.
>
> I'm wondering if anyone would be opposed to us/me making associative array having resize/reserve public?
>
> Currently there exists private members in druntime etc for this but they are not exposed.
>
> Basically I would like to be able to reserve n buckets before populating.
>
> For reference, see std::unordered_map, which is kvp, which has rehashing etc:
>
> ```cpp
>    template<typename _Key, typename _Tp,
>      typename _Hash = hash<_Key>,
>      typename _Pred = equal_to<_Key>,
>      typename _Alloc = allocator<std::pair<const _Key, _Tp>>>
>      class unordered_map
>      {
>        typedef __umap_hashtable<_Key, _Tp, _Hash, _Pred, _Alloc>
> _Hashtable;
>        _Hashtable _M_h;
> ```
>
> https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/ unordered_map.h
>
> https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/ hashtable.h

My main concern here would be that we not add calls which relate to how AAs are currently implemented but which may not relate to how they're implemented in the future if/when we make changes, and I'm inclined to think that anything to do with buckets is an implementation detail, making reserving not something that users should really be doing. It _might_ make sense to have a general reserve function that says that you're planning to insert X number of elements, allowing the AA to pre-allocate some stuff based on that information, but I don't know how much sense that really makes given that what that would mean for a given AA implementation could differ quite a bit.

If rehashing is what you want, we already have that:

https://dlang.org/spec/hash-map.html#properties

Ultimately though, I think that we want to be careful about how much we expose of the AA, since that's going to restrict what we can do to improve it in the future, and if someone wants to do much a tweaking, IMHO, a user-defined type would make more sense than the language-provided AAs.

Steven would likely have a much better idea of how much sense this idea makes given that he's been working on the AA stuff and is familiar with some of the changes that have been made to it over time.

- Jonathan M Davis



October 15
On Sunday, 15 October 2023 at 00:14:39 UTC, Jonathan M Davis wrote:

> My main concern here would be that we not add calls which relate to how AAs are currently implemented but which may not relate to how they're implemented in the future if/when we make changes, and I'm inclined to think that anything to do with buckets is an implementation detail, making reserving not something that users should really be doing. It _might_ make sense to have a general reserve function that says that you're planning to insert X number of elements, allowing the AA to pre-allocate some stuff based on that information, but I don't know how much sense that really makes given that what that would mean for a given AA implementation could differ quite a bit.
>
> If rehashing is what you want, we already have that:
>
> https://dlang.org/spec/hash-map.html#properties
>
> Ultimately though, I think that we want to be careful about how much we expose of the AA, since that's going to restrict what we can do to improve it in the future, and if someone wants to do much a tweaking, IMHO, a user-defined type would make more sense than the language-provided AAs.
>
> Steven would likely have a much better idea of how much sense this idea makes given that he's been working on the AA stuff and is familiar with some of the changes that have been made to it over time.
>
> - Jonathan M Davis

Someone wanting to work on this could build on the EMSI hashmap: https://github.com/dlang-community/containers/blob/master/src/containers/hashmap.d That would leave it as a library solution rather than messing with the AAs in the language.

October 15
On Sunday, 15 October 2023 at 00:14:39 UTC, Jonathan M Davis wrote:
> On Friday, October 13, 2023 5:32:29 AM MDT Imperatorn via Digitalmars-d wrote:
>> [...]
>
> My main concern here would be that we not add calls which relate to how AAs are currently implemented but which may not relate to how they're implemented in the future if/when we make changes, and I'm inclined to think that anything to do with buckets is an implementation detail, making reserving not something that users should really be doing. It _might_ make sense to have a general reserve function that says that you're planning to insert X number of elements, allowing the AA to pre-allocate some stuff based on that information, but I don't know how much sense that really makes given that what that would mean for a given AA implementation could differ quite a bit.
>
> [...]

I think you're basically right. But what could be done is to instead of reserving n buckets like the other languages do, maybe we could (if gc is used in this example) make a calculated gc reserve.

But, I've taken a quick look at 3-4 other languages, and they basically reserve buckets.
October 15
On Sunday, 15 October 2023 at 06:56:25 UTC, Imperatorn wrote:
>
> I think you're basically right. But what could be done is to instead of reserving n buckets like the other languages do, maybe we could (if gc is used in this example) make a calculated gc reserve.
>
> But, I've taken a quick look at 3-4 other languages, and they basically reserve buckets.

In java for example you can list initial map size in elements not buckets: https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html#HashMap-int-

If bucket reservation is added, it should be heavily emphasized that method is implementation specific, and no guarantees that would be available in alternative implementations. I.e. it should not be part of public AA interface.

Best regards,
Alexandru.
October 15
On Sunday, 15 October 2023 at 09:06:30 UTC, Alexandru Ermicioi wrote:
> On Sunday, 15 October 2023 at 06:56:25 UTC, Imperatorn wrote:
>>
>> I think you're basically right. But what could be done is to instead of reserving n buckets like the other languages do, maybe we could (if gc is used in this example) make a calculated gc reserve.
>>
>> But, I've taken a quick look at 3-4 other languages, and they basically reserve buckets.
>
> In java for example you can list initial map size in elements not buckets: https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html#HashMap-int-
>
> If bucket reservation is added, it should be heavily emphasized that method is implementation specific, and no guarantees that would be available in alternative implementations. I.e. it should not be part of public AA interface.
>
> Best regards,
> Alexandru.

Agreed
October 15

On 10/14/23 4:46 PM, Nick Treleaven wrote:

>

On Friday, 13 October 2023 at 11:32:29 UTC, Imperatorn wrote:

>

Hello there.

I'm wondering if anyone would be opposed to us/me making associative array having resize/reserve public?

Currently there exists private members in druntime etc for this but they are not exposed.

Basically I would like to be able to reserve n buckets before populating.

There was a PR for that but it wasn't merged. It only reserved bucket space, not space for elements. See:
https://github.com/dlang/druntime/pull/1929#issuecomment-336799304

Yeah, as identified by Andrei (and myself), reserving for an AA means reserving elements, not just buckets.

It does not help to compare to other languages, because D's semantic requirements for AAs require that you can keep a reference to an element long after the AA is gone and deallocated (C++ says as soon as you remove an element, all pointers to it are invalid).

The choices are bleak, especially without help from the GC. It's doable, but requires some mechanism that doesn't yet exist. I identified the options here: https://issues.dlang.org/show_bug.cgi?id=17881

-Steve

« First   ‹ Prev
1 2