Problem description:
Good evening,
Is there an efficient way of getting all the beginning and ending indexes of runs of True values in a boolean array?
Let’s say I have this array:
x = np.array([np.nan, 11, 13, np.nan, np.nan, np.nan, 9, 3, np.nan, 3, 4, np.nan])
I use np.isnan(x) so I get:
[True, False, False, True, True, True, False, False, True, False, False, True]
At the end I would like an array or list containing only the indexes of the NaN values, i.e. one index for a single NaN, or the beginning and ending index for consecutive NaN values:
[0, [3, 5], 8, 11]
Do I have to loop over the array myself and write a function, or is there an efficient NumPy way of doing it?
I already have something running, but since I have to deal with hundreds of thousands of values per array, and with multiple arrays, it takes time.
Solution 1: [1]
You can use groupby from the itertools module:
from itertools import groupby
import numpy as np

lst = []
# group consecutive (index, is_nan) pairs by the is_nan flag
for mask, grp in groupby(zip(np.arange(len(x)), np.isnan(x)), key=lambda x: x[1]):
    if mask:  # only for NaN
        idx = [idx for idx, _ in grp]
        lst.append([idx[0], idx[-1]] if len(idx) > 1 else idx[0])
Output:
>>> lst
[0, [3, 5], 8, 11]
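As a self-contained variant of the same groupby idea, here is a minimal sketch wrapped in a function. The name nan_runs_groupby is hypothetical, and enumerate replaces np.arange so that the result holds plain Python ints rather than NumPy scalars. It is still a Python-level loop, so it may not be the fastest option for very large arrays.

```python
from itertools import groupby

import numpy as np


def nan_runs_groupby(x):
    """Return NaN positions: a single index, or [first, last] for a run."""
    out = []
    # enumerate yields plain Python ints, so no NumPy scalars end up in the result
    for is_nan, grp in groupby(enumerate(np.isnan(x)), key=lambda pair: pair[1]):
        if is_nan:
            idx = [i for i, _ in grp]
            out.append([idx[0], idx[-1]] if len(idx) > 1 else idx[0])
    return out


x = np.array([np.nan, 11, 13, np.nan, np.nan, np.nan, 9, 3, np.nan, 3, 4, np.nan])
result = nan_runs_groupby(x)
print(result)  # [0, [3, 5], 8, 11]
```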
Solution 2: [2]
You can use boolean operations, shifting the np.isnan output to the left and to the right:
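# is the value a NaN?
m = np.isnan(x)
# is the preceding value not a NaN? (the left edge counts as "not a NaN")
m2 = np.r_[True, ~m[:-1]]
# is the following value not a NaN? (the right edge counts as "not a NaN")
m3 = np.r_[~m[1:], True]
# keep the NaNs that start or end a run
out = np.where(m & (m2 | m3))[0]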
Output:
array([ 0, 3, 5, 8, 11])
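The flat output above does not say which indices belong to the same run. One possible way to recover that, sketched here under the assumption that the array edges should count as non-NaN neighbours, is to keep the run starts and run ends in separate arrays and pair them up. The names starts, ends and pairs are illustrative only.

```python
import numpy as np

x = np.array([np.nan, 11, 13, np.nan, np.nan, np.nan, 9, 3, np.nan, 3, 4, np.nan])
m = np.isnan(x)

# A run starts where a NaN has no NaN before it (the array edge counts as non-NaN).
starts = np.flatnonzero(m & np.r_[True, ~m[:-1]])
# A run ends where a NaN has no NaN after it.
ends = np.flatnonzero(m & np.r_[~m[1:], True])

# One row per run: [first, last], both inclusive; a single NaN has first == last.
pairs = np.column_stack((starts, ends))
print(pairs.tolist())  # [[0, 0], [3, 5], [8, 8], [11, 11]]
```

Every run has exactly one start and one end, so the two arrays always have the same length and line up row by row.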
Solution 3: [3]
I have a function in my library haggis called haggis.npy_util.mask2runs which does almost exactly what you want:
from haggis.npy_util import mask2runs
mask2runs(np.isnan(x))
The result will be a two-column array containing the start (inclusive) and end (exclusive) indices for each run:
[[ 0,  1],
 [ 3,  6],
 [ 8,  9],
 [11, 12]]
You can get the lengths of each run directly by subtraction, or by adding an argument return_lengths=True.
The function is not complicated, and you can replicate it in a one-liner for the example that you have:
mask = np.isnan(x)
runs = np.flatnonzero(np.diff(np.r_[np.int8(0), mask.view(np.int8), np.int8(0)])).reshape(-1, 2)
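To get from the two-column runs array to the format asked for in the question, a small post-processing step is needed. The sketch below uses only NumPy (not haggis) with the same padded-diff idea, computes the run lengths by subtraction, and converts the exclusive end indices to inclusive ones. The variable names padded, lengths and result are illustrative.

```python
import numpy as np

x = np.array([np.nan, 11, 13, np.nan, np.nan, np.nan, 9, 3, np.nan, 3, 4, np.nan])
mask = np.isnan(x)

# Pad with zeros so runs touching either edge still produce a +1/-1 step.
padded = np.r_[np.int8(0), mask.view(np.int8), np.int8(0)]
# Non-zero steps alternate start (inclusive), end (exclusive).
runs = np.flatnonzero(np.diff(padded)).reshape(-1, 2)

# Run lengths by subtraction.
lengths = runs[:, 1] - runs[:, 0]

# Single index for a length-1 run, [first, last] (both inclusive) otherwise.
result = [int(s) if e - s == 1 else [int(s), int(e) - 1] for s, e in runs]
print(result)  # [0, [3, 5], 8, 11]
```

The final list comprehension loops over runs, not over elements, so its cost depends on the number of NaN runs and not on the array length.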
References:
Copyright Notice: This article follows StackOverflow’s copyright notice requirements and is licensed under CC BY-SA 3.0.
Article Source: StackOverflow
[1] Corralien
[2] mozway
[3] Mad Physicist