How do I determine which coordinates fall within a specific distance of each other?
from math import radians, cos, sin, asin, sqrt

import pandas as pd

df = pd.DataFrame(columns=['Id', 'Feature', 'Lat', 'Long'])
df['Id'] = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
# Ids 0-4 are trucks, 5-8 are vans, 9-11 are cars
df['Feature'] = ['Truck', 'Truck', 'Truck', 'Truck', 'Truck', 'Van', 'Van', 'Van', 'Van', 'Car', 'Car', 'Car']
df['Lat'] = [39.57713, 39.57723, 39.57671, 39.57672, 39.57697, 39.57188, 39.57151, 39.57153, 39.57197, 39.57613, 39.57577, 39.57595]
df['Long'] = [46.87062, 46.87004, 46.87001, 46.87066, 46.87027, 46.87489, 46.87482, 46.8752, 46.87528, 46.8757, 46.87572, 46.87545]
def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    """
    # convert all four decimal-degree arguments to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    c = 2 * asin(sqrt(a))
    # Radius of earth in meters is 6371000
    distance = 6371000 * c
    return distance
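A quick sanity check can catch argument-order or unit mistakes in a function like this. The sketch below restates the function so that it runs on its own and compares it with values that are easy to derive by hand: one degree of latitude along a meridian is R·π/180, roughly 111.19 km for the 6,371,000 m radius used above. The tolerances are arbitrary choices for illustration.

```python
from math import radians, cos, sin, asin, sqrt, pi


def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 6371000 * 2 * asin(sqrt(a))


# One degree of latitude along a meridian is R * pi / 180 meters.
one_degree = 6371000 * pi / 180
assert abs(haversine(0.0, 0.0, 0.0, 1.0) - one_degree) < 1.0

# Longitude comes first: at 60 degrees north, one degree of longitude
# spans roughly half that distance, because cos(60 degrees) = 0.5.
assert abs(haversine(0.0, 60.0, 1.0, 60.0) - one_degree / 2) < 5.0
```

If the second assertion fails while the first passes, the latitude and longitude arguments have most likely been swapped.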
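How can I find which Car/Truck IDs are within 420 m of each other, which Truck/Van IDs are within 655 m, and which Car/Van IDs are within 425 m?
The ideal output would be:
Truck 3 is within distance of Car 11
Truck 3 is within distance of Van 5
Car 10 is within distance of Van 8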
Solution
You can use pd.merge(how='cross') (available since pandas 1.2) to generate all the pairs you want:
>>> groups = df.groupby('Feature')
>>> pd.merge(groups.get_group('Car'),groups.get_group('Truck'),how='cross',suffixes=('','_cmp'))
    Id Feature       Lat      Long  Id_cmp Feature_cmp   Lat_cmp  Long_cmp
0    9     Car  39.57613  46.87570       0       Truck  39.57713  46.87062
1    9     Car  39.57613  46.87570       1       Truck  39.57723  46.87004
2    9     Car  39.57613  46.87570       2       Truck  39.57671  46.87001
3    9     Car  39.57613  46.87570       3       Truck  39.57672  46.87066
4    9     Car  39.57613  46.87570       4       Truck  39.57697  46.87027
5   10     Car  39.57577  46.87572       0       Truck  39.57713  46.87062
6   10     Car  39.57577  46.87572       1       Truck  39.57723  46.87004
7   10     Car  39.57577  46.87572       2       Truck  39.57671  46.87001
8   10     Car  39.57577  46.87572       3       Truck  39.57672  46.87066
9   10     Car  39.57577  46.87572       4       Truck  39.57697  46.87027
10  11     Car  39.57595  46.87545       0       Truck  39.57713  46.87062
11  11     Car  39.57595  46.87545       1       Truck  39.57723  46.87004
12  11     Car  39.57595  46.87545       2       Truck  39.57671  46.87001
13  11     Car  39.57595  46.87545       3       Truck  39.57672  46.87066
14  11     Car  39.57595  46.87545       4       Truck  39.57697  46.87027
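To see what a cross merge does in isolation, here is a minimal sketch on two tiny, made-up frames (`left` and `right` are illustrative names, not part of the answer). Every row of the first frame is paired with every row of the second, so the result has `len(left) * len(right)` rows, and `suffixes=('', '_cmp')` renames only the clashing columns coming from the right-hand frame.

```python
import pandas as pd

# Two small illustrative frames; the values are a subset of the Ids used above.
left = pd.DataFrame({'Id': [9, 10], 'Feature': ['Car', 'Car']})
right = pd.DataFrame({'Id': [0, 1, 2], 'Feature': ['Truck', 'Truck', 'Truck']})

# how='cross' builds the Cartesian product of the rows (pandas >= 1.2).
pairs = pd.merge(left, right, how='cross', suffixes=('', '_cmp'))

# 2 rows x 3 rows -> 6 pairs; left columns keep their names, right ones get '_cmp'.
assert len(pairs) == len(left) * len(right)
assert list(pairs.columns) == ['Id', 'Feature', 'Id_cmp', 'Feature_cmp']
print(pairs)
```

Because the number of pairs grows with the product of the group sizes, this approach suits small to medium groups; very large groups may call for a spatial index instead.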
This makes it easy to generate all the comparisons we want to make:
>>> distances = {('Car', 'Truck'): 420, ('Truck', 'Van'): 655, ('Car', 'Van'): 425}
>>> all_cmp = pd.concat([
...     pd.merge(groups.get_group(dist_from), groups.get_group(dist_to), how='cross', suffixes=('', '_cmp'))
...     for dist_from, dist_to in distances
... ])
>>> all_cmp.head()
   Id Feature       Lat     Long  Id_cmp Feature_cmp   Lat_cmp  Long_cmp
0   9     Car  39.57613  46.8757       0       Truck  39.57713  46.87062
1   9     Car  39.57613  46.8757       1       Truck  39.57723  46.87004
2   9     Car  39.57613  46.8757       2       Truck  39.57671  46.87001
3   9     Car  39.57613  46.8757       3       Truck  39.57672  46.87066
4   9     Car  39.57613  46.8757       4       Truck  39.57697  46.87027
>>> all_cmp.tail()
    Id Feature       Lat      Long  Id_cmp Feature_cmp   Lat_cmp  Long_cmp
7   10     Car  39.57577  46.87572       8         Van  39.57197  46.87528
8   11     Car  39.57595  46.87545       5         Van  39.57188  46.87489
9   11     Car  39.57595  46.87545       6         Van  39.57151  46.87482
10  11     Car  39.57595  46.87545       7         Van  39.57153  46.87520
11  11     Car  39.57595  46.87545       8         Van  39.57197  46.87528
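One possible variation, which is not part of the answer itself: the threshold can be attached to each block of pairs while it is being built, and `ignore_index=True` gives the concatenated frame a unique index. The sketch below uses a three-row subset of the data; the column name `thresh` and the loop variables `a`, `b`, and `limit` are assumptions made for illustration.

```python
import pandas as pd

# A three-row subset of the data above, one vehicle per feature.
df = pd.DataFrame({
    'Id': [0, 5, 9],
    'Feature': ['Truck', 'Van', 'Car'],
    'Lat': [39.57713, 39.57188, 39.57613],
    'Long': [46.87062, 46.87489, 46.87570],
})
distances = {('Car', 'Truck'): 420, ('Truck', 'Van'): 655, ('Car', 'Van'): 425}
groups = df.groupby('Feature')

all_cmp = pd.concat(
    [
        # Cross-merge the two groups, then stamp the pair's threshold on every row.
        pd.merge(groups.get_group(a), groups.get_group(b), how='cross', suffixes=('', '_cmp'))
          .assign(thresh=limit)
        for (a, b), limit in distances.items()
    ],
    ignore_index=True,  # avoid repeated index labels across the three blocks
)
print(all_cmp[['Id', 'Feature', 'Id_cmp', 'Feature_cmp', 'thresh']])
```

With the threshold stored as a column, the later comparison becomes a plain column-to-column test and no per-row dictionary lookup is needed.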
We can then compute the distances easily. We also need to align the threshold distance with each pair:
>>> # haversine() expects longitude first, then latitude
>>> dist = all_cmp.agg(lambda s: haversine(s['Long'], s['Lat'], s['Long_cmp'], s['Lat_cmp']), axis='columns')
>>> thresh = all_cmp[['Feature', 'Feature_cmp']].agg(lambda s: distances[tuple(s)], axis='columns')
From there, compare the two, keep the rows you want, and optionally aggregate:
>>> all_cmp[dist < thresh]
    Id Feature       Lat      Long  Id_cmp Feature_cmp   Lat_cmp  Long_cmp
13  11     Car  39.57595  46.87545       3       Truck  39.57672  46.87066
12   3   Truck  39.57672  46.87066       5         Van  39.57188  46.87489
7   10     Car  39.57577  46.87572       8         Van  39.57197  46.87528
These three pairs are the ones listed in the question's ideal output.
>>> close = all_cmp[dist < thresh].groupby('Id')['Id_cmp'].agg(list)
>>> close
Id
3     [5]
10    [8]
11    [3]
Name: Id_cmp, dtype: object
>>> df.merge(close.rename('within dist').reset_index())
   Id Feature       Lat      Long within dist
0   3   Truck  39.57672  46.87066         [5]
1  10     Car  39.57577  46.87572         [8]
2  11     Car  39.57595  46.87545         [3]
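The row-wise lambda calls the scalar function once per pair, which can become slow on large frames. As a sketch under stated assumptions, the same formula can be written with NumPy so that it works on whole columns at once, and the surviving rows can be turned into sentences like the ones the question asks for. `haversine_np` and the `thresh` and `dist` columns are hypothetical names introduced here, and `pairs` is a hand-picked handful of pairs from the data above rather than the full cross merge.

```python
import numpy as np
import pandas as pd


def haversine_np(lon1, lat1, lon2, lat2, radius=6371000):
    """Vectorized haversine distance in meters; inputs are arrays or Series in decimal degrees."""
    lon1, lat1, lon2, lat2 = (
        np.radians(np.asarray(v, dtype=float)) for v in (lon1, lat1, lon2, lat2)
    )
    a = np.sin((lat2 - lat1) / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    return radius * 2 * np.arcsin(np.sqrt(a))


# A few pairs taken from the data above, each with its threshold in meters.
pairs = pd.DataFrame({
    'Id': [11, 11, 3, 3, 10],
    'Feature': ['Car', 'Car', 'Truck', 'Truck', 'Car'],
    'Lat': [39.57595, 39.57595, 39.57672, 39.57672, 39.57577],
    'Long': [46.87545, 46.87545, 46.87066, 46.87066, 46.87572],
    'Id_cmp': [3, 0, 5, 8, 8],
    'Feature_cmp': ['Truck', 'Truck', 'Van', 'Van', 'Van'],
    'Lat_cmp': [39.57672, 39.57713, 39.57188, 39.57197, 39.57197],
    'Long_cmp': [46.87066, 46.87062, 46.87489, 46.87528, 46.87528],
    'thresh': [420, 420, 655, 655, 425],
})

# Longitude first, then latitude, matching the scalar haversine() signature.
pairs['dist'] = haversine_np(pairs['Long'], pairs['Lat'], pairs['Long_cmp'], pairs['Lat_cmp'])
close = pairs[pairs['dist'] < pairs['thresh']]

messages = [
    f"{row.Feature} {row.Id} is within distance of {row.Feature_cmp} {row.Id_cmp}"
    for row in close.itertuples(index=False)
]
print("\n".join(messages))
```

On these five pairs the filter should keep Car 11 / Truck 3, Truck 3 / Van 5, and Car 10 / Van 8, and drop the other two, which sit a few meters outside their thresholds.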