如何解决用双高斯拟合数据
我正在尝试使用双高斯分布拟合某些数据。数据看起来几乎是完美的高斯分布,但是我想尝试一下,无论我输入的是最初的猜测如何,我都无法获得比特定形状更好的拟合度。我尝试使用下面列出的两个高斯方程,但都不合适。总的来说,我希望它在连续体上更平整(没有“翅膀”),并在可能的情况下更平滑,更贴近实际形状。
由于后续分析的性质,由于我需要拟合参数,因此拟合需要为双重高斯型,因此无法考虑其他拟合方法。数据可以在这里找到: https://docs.google.com/spreadsheets/d/1kMO2ogAL8ZCiDeY29kBvv5lzMfAD7dLj-5rKW8kW9Go/edit?usp=sharing
下面是我用来尝试拟合数据和输出图形的代码示例。
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats
from scipy.optimize import curve_fit
from lmfit import Model
with open("data.txt","r") as f:
content=[i.strip() for i in f.readlines()]
vel=[]
I=[]
dI=[]
for i in range(8,len(content)):
line=content[i].split()
vel.append(float(line[0]))
I.append(float(line[1]))
dI.append(float(line[2]))
def gaussian(x,A,x0,sig):
return A*np.exp(-(x-x0)**2/(2*sig**2))
def gaussian2(x,amp,cen,wid):
return (amp/(np.sqrt(2*np.pi)*wid))*np.exp(-(x-cen)**2/(2*wid**2))
def multi_gaussian(x,*pars):
offset = pars[-1]
g1 = gaussian(x,pars[1],pars[0],pars[2])
g2 = gaussian(x,pars[3],pars[4])
return g1 + g2 + offset
def multi_gaussian2(x,*pars):
offset = pars[-1]
g1 = gaussian2(x,pars[2])
g2 = gaussian2(x,pars[4])
return g1 + g2 + offset
offset=1
guess = [-15,-0.01,10,1]
popt,pcov = curve_fit(multi_gaussian,vel,I,guess)
popt2,pcov2 = curve_fit(multi_gaussian2,guess)
x=np.linspace(np.min(vel),np.max(vel),2000)
plt.figure()
plt.scatter(vel,s=0.1,c='b')
plt.plot(x,multi_gaussian(x,*popt),'r--',linewidth=1,label='Gaussian1')
plt.plot(x,multi_gaussian2(x,*popt2),'g--',label='Gaussian2')
plt.legend(loc='best')
plt.show()
解决方法
链接的电子表格中的数据的速度和强度只有2个有效数字。这使得基本上不可能尝试“优化”适合度以取得更好的结果。就是说,我强烈建议您使用这样的lmfit脚本,该脚本应包括适合度的强度不确定性:
import matplotlib.pyplot as plt
import numpy as np
from lmfit.models import GaussianModel,ConstantModel
data = np.loadtxt('ddata.txt',skiprows=1)
v = data[:,0]
i = data[:,1]
di = data[:,2]
model = (ConstantModel(prefix='offset_') +
GaussianModel(prefix='p1_') +
GaussianModel(prefix='p2_'))
params = model.make_params(offset_c=1,p1_amplitude=-1.,p1_sigma=100,p1_center=25,p2_amplitude=-1.,p2_sigma=100,p2_center=-25)
init = model.eval(params,x=v)
result = model.fit(i,params,weights=1.0/(di+1.e-9),x=v)
print(result.fit_report())
plt.figure()
plt.scatter(v,i,s=0.5,label='data')
plt.plot(v,init,label='init')
plt.plot(v,result.best_fit,label='fit')
plt.legend()
plt.xlabel('velocity (mm/s)')
plt.ylabel('intensity')
plt.show()
对于您提供的数据,将打印出适合的报告,如下所示:
[[Model]]
((Model(constant,prefix='offset_') + Model(gaussian,prefix='p1_')) + Model(gaussian,prefix='p2_'))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 128
# data points = 191
# variables = 7
chi-square = 654.770994
reduced chi-square = 3.55853801
Akaike info crit = 249.314315
Bayesian info crit = 272.080229
[[Variables]]
offset_c: 1.00013943 +/- 5.1045e-05 (0.01%) (init = 1)
p1_amplitude: -1.36807407 +/- 0.08677931 (6.34%) (init = -1)
p1_center: 46.8019583 +/- 3.77807981 (8.07%) (init = 25)
p1_sigma: 57.3859589 +/- 2.39823612 (4.18%) (init = 100)
p2_amplitude: -1.16999330 +/- 0.08533205 (7.29%) (init = -1)
p2_center: -76.1117581 +/- 3.49975073 (4.60%) (init = -25)
p2_sigma: 51.7080694 +/- 2.08860434 (4.04%) (init = 100)
p1_fwhm: 135.133604 +/- 5.64741436 (4.18%) == '2.3548200*p1_sigma'
p1_height: -0.00951073 +/- 2.6406e-04 (2.78%) == '0.3989423*p1_amplitude/max(2.220446049250313e-16,p1_sigma)'
p2_fwhm: 121.763196 +/- 4.91828727 (4.04%) == '2.3548200*p2_sigma'
p2_height: -0.00902683 +/- 3.5183e-04 (3.90%) == '0.3989423*p2_amplitude/max(2.220446049250313e-16,p2_sigma)'
[[Correlations]] (unreported correlations are < 0.100)
C(p1_center,p2_amplitude) = -0.967
C(p1_amplitude,p2_center) = 0.959
C(p1_center,p2_center) = 0.956
C(p1_amplitude,p2_amplitude) = -0.946
C(p1_amplitude,p1_center) = 0.943
C(p2_amplitude,p2_center) = -0.943
版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。