Python Numpy: beginning ending indexes of True value in boolean array [duplicate]

社会演员多 • April 29, 2023, 8:28 AM • Problem solving • 105 views

Problem description:

Good evening,

Is there an efficient way of getting all beginning and ending indexes of True values in a boolean array?
Let’s say I have this array:
x = np.array([np.nan, 11, 13, np.nan, np.nan, np.nan, 9, 3, np.nan, 3, 4, np.nan])

I use np.isnan(x) so I get:
[True, False, False, True, True, True, False, False, True, False, False, True]

In the end I would like an array or list containing only the indexes of the NaN values, i.e. a single index for an isolated NaN, or the beginning and ending indexes for a run of consecutive NaN values:
[0, [3, 5], 8, 11]

Do I have to loop over the array myself and write a function, or is there an efficient NumPy way of doing it?

I already have something running, but as I have to deal with hundreds of thousands of values per array, and multiple arrays as well, it takes time.

Solution 1:^[1]

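You can use groupby from the itertools module:

from itertools import groupby

lst = []
# group consecutive (index, is_nan) pairs by the is_nan flag
for mask, grp in groupby(enumerate(np.isnan(x)), key=lambda pair: pair[1]):
    if mask:  # only for NaN runs
        idx = [i for i, _ in grp]
        # [start, end] for a run, a bare index for an isolated NaN
        lst.append([idx[0], idx[-1]] if len(idx) > 1 else idx[0])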

Output:

>>> lst
[0, [3, 5], 8, 11]
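
For reference, here is a self-contained sketch of the same groupby idea, wrapped in a function so it can be run as-is. The function name nan_runs is hypothetical (it is not part of the answer above), and since this is still a Python-level loop it may be slow on arrays with hundreds of thousands of values.

```python
from itertools import groupby

import numpy as np


def nan_runs(x):
    """Return a bare index for each isolated NaN and an inclusive
    [start, end] pair for each run of consecutive NaNs.

    Hypothetical helper name, used only for illustration.
    """
    out = []
    # groupby yields one group per run of consecutive equal keys,
    # so each group is either a run of NaNs or a run of non-NaNs.
    for is_nan, grp in groupby(enumerate(np.isnan(x)), key=lambda pair: pair[1]):
        if is_nan:
            idx = [i for i, _ in grp]
            out.append([idx[0], idx[-1]] if len(idx) > 1 else idx[0])
    return out


x = np.array([np.nan, 11, 13, np.nan, np.nan, np.nan, 9, 3, np.nan, 3, 4, np.nan])
result = nan_runs(x)
print(result)  # [0, [3, 5], 8, 11]
```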

Solution 2:^[2]

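You can use boolean operations, shifting the np.isnan output to the left/right:

# is the value a NaN?
m = np.isnan(x)
# is the preceding value not a NaN? (pad with True so a run touching the start still gets a start index)
m2 = np.r_[True, ~m[:-1]]
# is the following value not a NaN? (pad with True so a run touching the end still gets an end index)
m3 = np.r_[~m[1:], True]

# keep the NaN positions that start or end a run
out = np.where((m & m2) | (m & m3))[0]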

Output:

array([ 0,  3,  5,  8, 11])
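
The flat array above lists an isolated NaN once and a longer run twice, so its entries cannot be paired up by position alone. One possible way around this, sketched below, is to keep the starts and ends as two separate arrays and zip them. The variable names starts, ends and result are assumptions made for this illustration, not part of the answer.

```python
import numpy as np

x = np.array([np.nan, 11, 13, np.nan, np.nan, np.nan, 9, 3, np.nan, 3, 4, np.nan])
m = np.isnan(x)

# A run starts where the value is NaN and the previous one is not.
# Padding with True treats "before the array" as not-NaN.
starts = np.flatnonzero(m & np.r_[True, ~m[:-1]])
# A run ends where the value is NaN and the next one is not.
ends = np.flatnonzero(m & np.r_[~m[1:], True])

# Every run has exactly one start and one end, so the arrays align.
result = [int(s) if s == e else [int(s), int(e)] for s, e in zip(starts, ends)]
print(result)  # [0, [3, 5], 8, 11]
```

The boolean work stays vectorized; only the final list comprehension iterates, and it does so once per run rather than once per element.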

Solution 3:^[3]

I have a function in my library haggis called haggis.npy_util.mask2runs which does almost exactly what you want:

mask2runs(np.isnan(x))

The result will be a two-column array containing the start (inclusive) and end (exclusive) indices for each run:

[[ 0,  1],
 [ 3,  6],
 [ 8,  9],
 [11, 12]]

You can get the lengths of each run directly by subtraction, or by adding an argument return_lengths=True.

The function is not complicated, and you can replicate it in a one-liner for the example that you have:

mask = np.isnan(x)
runs = np.flatnonzero(np.diff(np.r_[np.int8(0), mask.view(np.int8), np.int8(0)])).reshape(-1, 2)
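
To make the one-liner easier to follow, here is a step-by-step sketch using plain NumPy only. The function name mask_to_runs is hypothetical and this is not the haggis implementation. It also shows the run lengths obtained by subtraction and one way to convert the half-open runs into the format the question asks for.

```python
import numpy as np


def mask_to_runs(mask):
    """Return an (n, 2) array of [start, end) indices for each run of True.

    Hypothetical helper written for illustration only.
    """
    mask = np.asarray(mask, dtype=bool)
    # Pad with zeros so runs touching either edge still produce a transition.
    padded = np.r_[np.int8(0), mask.view(np.int8), np.int8(0)]
    # diff is +1 where a run starts and -1 one past where it ends.
    edges = np.flatnonzero(np.diff(padded))
    # Transitions alternate start, end, start, end, ...
    return edges.reshape(-1, 2)


x = np.array([np.nan, 11, 13, np.nan, np.nan, np.nan, 9, 3, np.nan, 3, 4, np.nan])
runs = mask_to_runs(np.isnan(x))
lengths = runs[:, 1] - runs[:, 0]

# Convert exclusive ends to the inclusive format from the question.
result = [int(s) if e - s == 1 else [int(s), int(e - 1)] for s, e in runs]
print(runs.tolist())     # [[0, 1], [3, 6], [8, 9], [11, 12]]
print(lengths.tolist())  # [1, 3, 1, 1]
print(result)            # [0, [3, 5], 8, 11]
```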

References:

Article Source: StackOverflow

[1] Corralien

[2] mozway

[3] Mad Physicist
