sequence vs. list (vs. nlist)

Hi all,

[using MPS 2017.2.3 with Windows 7]

I have a checking rule that makes heavy use of sequences. It performs some checks on a big sequence and filters down the sequence (partly with a recursive function) until the sequence only contains items that don't meet the requirements of the check.

Eventually, I build an error message for the user, where I access the final sequence (thus actually building it, I suppose). With some sample models, this check takes 8-10 seconds per run (!) with my current implementation, which is quite a lot.

When I replace the sequence type in my implementation with list OR nlist, the check takes only 150-200 ms. I did not change any logic (still using recursion, still using higher-order functions like disjunction and union).

Can somebody explain that? Is the generated code significantly different when using list vs. sequence? I recently tried to make heavy use of sequences since our memory footprint is already pretty big and I was hoping that we could reduce it by using sequences instead of lists, but maybe this is a flawed assumption already.

Any information on when and how to use sequences would be appreciated, but in particular I would like to understand the above-mentioned differences in runtime.

3 comments

Operations on sequences are executed at the time you iterate over the sequence. When you write sequence.where(...) the filter is not applied immediately, but a new FilteringSequence is created.

Every time you iterate over a sequence all filter/map/... functions are applied again. Sequences are useful to save memory when you chain multiple operations and use the resulting sequence only once. If you use the result multiple times it makes sense to cache the computations by creating a list from the sequence.
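
To make the re-evaluation concrete, here is a minimal sketch in plain Java rather than in the MPS collections language. The `where` helper is a hypothetical stand-in for a lazily filtering sequence and is not the code MPS actually generates; the class name, the element count and the predicate are assumptions chosen for illustration. It counts how often the predicate runs when `size` is requested twice, once on the lazy sequence and once on a list built from it. The filter result is empty in both cases, which shows that an empty result still costs a full pass over the source.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

class LazySequenceDemo {

    // Hypothetical lazy filter: it only stores source and predicate.
    // The predicate runs again on every new iteration.
    static <T> Iterable<T> where(Iterable<T> source, Predicate<T> predicate) {
        return () -> {
            Iterator<T> it = source.iterator();
            return new Iterator<T>() {
                private T pending;
                private boolean ready;

                @Override
                public boolean hasNext() {
                    // Pull from the source until an element passes the filter.
                    while (!ready && it.hasNext()) {
                        T candidate = it.next();
                        if (predicate.test(candidate)) {
                            pending = candidate;
                            ready = true;
                        }
                    }
                    return ready;
                }

                @Override
                public T next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    ready = false;
                    return pending;
                }
            };
        };
    }

    // Counting the elements requires a complete iteration.
    static int size(Iterable<?> seq) {
        int n = 0;
        for (Object ignored : seq) {
            n++;
        }
        return n;
    }

    // Returns how many times the predicate was called after asking for the size twice.
    static int predicateCallsAfterTwoSizes(boolean cacheAsList) {
        List<Integer> source = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            source.add(i);
        }
        AtomicInteger calls = new AtomicInteger();
        Iterable<Integer> filtered = where(source, x -> {
            calls.incrementAndGet();
            return x < 0; // nothing matches, so the result is empty
        });
        if (cacheAsList) {
            // Materialize once; later iterations read the stored elements only.
            List<Integer> cached = new ArrayList<>();
            for (Integer x : filtered) {
                cached.add(x);
            }
            filtered = cached;
        }
        size(filtered);
        size(filtered);
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("lazy:   " + predicateCallsAfterTwoSizes(false));
        System.out.println("cached: " + predicateCallsAfterTwoSizes(true));
    }
}
```

In this sketch the lazy variant evaluates the predicate once per source element for every iteration, while the cached variant pays that cost a single time when the list is built. With several chained or recursive operations, each extra iteration repeats the whole chain.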

Thanks Sascha, your explanation meets my expectations. It still seems odd to me that just iterating over the sequence once (which is what I measured) takes ~4.5 seconds. The checking rule in question iterates over the sequence twice in total, which leads to the 8-10 seconds per run that I mentioned in my original post.
Iterating the sequence itself can't be the issue since the sequence is empty most of the time. Iteration (e.g. by calling '.size') still takes several seconds.

Here is a screenshot of a profiling session I've done. The total times are way bigger with the profiler attached, but they still reflect the proportions. Not sure if this provides any additional insight, but the getString method on SPropertyOperations seems to do some heavy lifting.
