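Read ten thousand volumes and commune with the ancients: working around ChatGPT's 4096-token limit to build your own AI assistant for vertical-domain material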

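ChatGPT is remarkably general-purpose. It knows astronomy and geography, draws on sources ancient and modern, Chinese and foreign, and seems to know almost everything. On specialized knowledge in a vertical domain, however, it can turn vague and evasive. In this article we "feed" domain-specific study material to ChatGPT so that it can "learn" the material before answering specialist questions.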

The domain-specific corpus problem

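The domain-specific corpus problem can be understood as a knowledge graph confined to a specific scope, that is, giving GPT a retrieval dimension up front. For example, everyone has read Lu Xun's famous essay "From Hundred-Plant Garden to Three-Flavor Study" (《从百草园到三味书屋》), which contains the story of the "beauty snake". Suppose we set no specific scope for GPT and ask about the "beauty snake" directly.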

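The problem is apparent at a glance: ChatGPT's understanding of the "beauty snake" story is off. It assumes the "beauty snake" refers to the tale of Bai Suzhen, Xu Xian, and Fahai in "The Legend of the White Snake".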

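As we all know, though, the "beauty snake" in "From Hundred-Plant Garden to Three-Flavor Study" is a monster with a human head and a snake's body that can call a person's name; if the person answers, it comes at night to eat that person's flesh.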

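So if we want to discuss the "beauty snake", ChatGPT needs a specific "context" before it can understand what we are actually talking about, which means feeding "From Hundred-Plant Garden to Three-Flavor Study" to ChatGPT as corpus material. Of course, an essay this widely known is presumably already part of ChatGPT's corpus. But suppose a paper or other material from some field does not appear in that corpus, and its length also exceeds ChatGPT's input limit of 4096 tokens. Then things get troublesome, which is why giving ChatGPT the ability to learn "new material" is so necessary.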

Building a corpus index with llama_index

LlamaIndex (GPT Index) is a GPT project for retrieval over a specific corpus. It connects external corpus data to GPT through an index file. First, install the project:

pip3 install llama-index

Note that the project depends on the langchain module. To be safe, upgrade langchain as well:

pip3 install --upgrade langchain

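What LlamaIndex does is convert our raw corpus data into a vector-based index, which is very efficient for retrieval. It uses this index to find the most relevant parts based on the similarity between the query and the data. It then inserts the retrieved content into the prompt it sends to GPT, so that GPT has the "context" it needs to answer the question.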

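The workflow in detail:

Convert the local answer dataset into vectors and store them in the vector database (index.json).

When the user enters a query, convert the question into a vector and look up the top-K closest answers in the vector database. This is in fact the most common question-answering retrieval scheme: without GPT, the relevant answers would simply be returned and the process would end there.

With GPT, the overall structure of the answer can be improved. In a pure search scenario this refinement adds little. But in a chat scenario specific to a vertical domain, where replies cite content from that domain, data retrieval becomes more precise.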
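The retrieval and prompt-assembly steps of this workflow can be illustrated with a minimal sketch. It is not LlamaIndex's implementation: the three-dimensional vectors are made-up stand-ins for real embeddings, and the names `cosine_similarity`, `top_k`, and `build_prompt`, along with the prompt wording, are hypothetical choices for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the k chunk texts whose vectors are most similar to the query.

    `store` maps chunk text -> embedding vector (toy data here).
    """
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Place the retrieved chunks in front of the question (hypothetical wording)."""
    context = "\n".join(context_chunks)
    return (
        "Context information is below.\n"
        f"{context}\n"
        f"Using only this context, answer the question: {question}"
    )


# Toy vectors standing in for real embeddings (assumption for illustration).
store = {
    "The beauty snake has a human head and a snake's body.": [0.9, 0.1, 0.0],
    "The garden was full of crickets and mulberries.": [0.1, 0.9, 0.0],
    "The teacher at the study was strict but learned.": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # pretend embedding of the user's question

chunks = top_k(query_vec, store, k=2)
prompt = build_prompt("What is the beauty snake?", chunks)
print(prompt)
```

In a real setup both the stored vectors and the query vector would come from the same embedding model, and the assembled prompt would be sent to GPT as the final step.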

First, save the essay "From Hundred-Plant Garden to Three-Flavor Study" into the project's data directory, then write the code:

import os

from langchain import OpenAI
from llama_index import (
    GPTSimpleVectorIndex,
    LLMPredictor,
    ServiceContext,
    SimpleDirectoryReader,
)

# Placeholder: replace 'apikey' with your own OpenAI API key
os.environ["OPENAI_API_KEY"] = 'apikey'


class LLma:

    # Build the local index
    def create_index(self, dir_path="./data"):

        # Read the documents in the data folder
        documents = SimpleDirectoryReader(dir_path).load_data()

        # Convert the documents into a vector index
        index = GPTSimpleVectorIndex.from_documents(documents)

        print(documents)

        # Save the index to disk
        index.save_to_disk('./index.json')

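Here SimpleDirectoryReader loads the corpus article from the data directory, and the GPTSimpleVectorIndex.from_documents method converts it into a vector index, which is then stored in the index.json file on the local disk.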
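Before a document can be embedded it has to be cut into pieces small enough for the model, which is why a single article can end up as several nodes in the index. The sketch below shows the general idea of overlapping chunks. It is an illustration under stated assumptions, not the library's own splitter: sizes are counted in characters rather than tokens, and `split_into_chunks` with its parameters is a hypothetical helper.

```python
def split_into_chunks(text, chunk_size=200, overlap=20):
    """Split text into pieces of at most chunk_size characters.

    Consecutive pieces share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one piece.
    Sizes are in characters here (an assumption; real splitters count tokens).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    chunks = []
    step = chunk_size - overlap
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last piece already reaches the end of the text
        start += step
    return chunks


# A 450-character dummy text made of repeating digits.
sample = "".join(str(i % 10) for i in range(450))
pieces = split_into_chunks(sample, chunk_size=200, overlap=20)
print([len(p) for p in pieces])
```

Each piece would then be embedded separately, so retrieval can return just the passage relevant to a question instead of the whole article.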

Run the index-building method:

if __name__ == '__main__':

    llma = LLma()

    # Build the index
    llma.create_index()
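Once the run finishes, the saved file can be summarized with the standard library instead of being read by eye. The sketch below assumes the key layout visible in the index file printed in this article ("index_struct", "__data__", "nodes_dict", "doc_id_dict", "embeddings_dict"); other library versions may lay the file out differently. `summarize_index` is a hypothetical helper, and the demo writes a tiny made-up file so the snippet runs without an API key.

```python
import json
import os
import tempfile


def summarize_index(path="./index.json"):
    """Return basic counts from a saved index file.

    Assumes the layout shown in this article's index.json dump;
    this is an assumption, not a documented file format.
    """
    with open(path, "r", encoding="utf-8") as f:
        payload = json.load(f)

    data = payload["index_struct"]["__data__"]
    embeddings = data["embeddings_dict"]
    return {
        "index_id": data["index_id"],
        "documents": len(data["doc_id_dict"]),
        "nodes": len(data["nodes_dict"]),
        # Distinct vector lengths found; normally a single value.
        "dimensions": sorted({len(vec) for vec in embeddings.values()}),
    }


# Made-up miniature index used only to demonstrate the helper.
sample = {
    "index_struct": {
        "__type__": "simple_dict",
        "__data__": {
            "index_id": "demo",
            "summary": None,
            "nodes_dict": {"n1": "t1", "n2": "t2"},
            "doc_id_dict": {"d1": ["n1", "n2"]},
            "embeddings_dict": {"n1": [0.1, 0.2, 0.3], "n2": [0.4, 0.5, 0.6]},
        },
    }
}

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "index.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    summary = summarize_index(demo_path)

print(summary)
```

Calling `summarize_index("./index.json")` on the real file would report how many nodes the article was split into and the length of each embedding vector.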

Contents of the index:

{"index_struct": {"__type__": "simple_dict", "__data__": {"index_id": "86c83b5a-a975-43ab-8505-cbc8f0ae68e2", "summary": null, "nodes_dict": {"da552579-e0f4-4ee0-be68-a3c392e39dc2": "a2521cfa-13c5-49b2-9cfd-7206fe493666", "c1f7df04-5e6c-4327-a0cc-4a3489d50d19": "68b609e3-2ec5-4de2-ac43-eb28105364ca"}, "doc_id_dict": {"87411099-60d8-4272-a7d1-6e8676fc42a0": ["da552579-e0f4-4ee0-be68-a3c392e39dc2", "c1f7df04-5e6c-4327-a0cc-4a3489d50d19"]}, "embeddings_dict": {"da552579-e0f4-4ee0-be68-a3c392e39dc2": [0.004821529611945152, -0.005787167698144913, 0.00886388961225748, -0.0005273548304103315, -0.0007779211737215519, 0.022242968901991844, -0.0035828494001179934, -0.023534925654530525, -0.03012790158390999, -0.014744291082024574, 0.004718306474387646, -0.0010788505896925926, -0.006236688699573278, 0.0033247910905629396, 0.01692862994968891, 0.02300216071307659, 0.01628931239247322, 0.008157975040376186, 0.028822625055909157, -0.0011337919859215617, 0.006499741692095995, 0.02746407315135002, -0.016302630305290222, -0.002881929511204362, -0.011933951638638973, 0.016502417623996735, 0.03031436912715435, -0.016489099711179733, 0.003935806918889284, -0.0009106963989324868, 0.0039058385882526636, 0.004168891813606024, -0.018566885963082314, -0.00980954896658659, -0.026451818645000458, -0.027490710839629173, -0.008237889967858791, -0.005337646696716547, 0.010009336285293102, 0.0037393493112176657, 0.013931823894381523, 0.0008798958733677864, -0.004105625674128532, 0.011208058334887028, -0.01188733521848917, 0.008311145007610321, -0.020058630034327507, -0.006176752503961325, -0.01582314260303974, 0.026438498869538307, 0.005500806029886007, 0.005507465451955795, -0.028103390708565712, -0.01434471644461155, -0.010175825096666813, 0.011747484095394611, 0.01688867248594761, 0.026558371260762215, -0.010755208320915699, -0.011754143983125687, 0.014970717020332813, 0.017115099355578423, -0.02107088454067707, -0.015317014418542385, -0.020990969613194466, 0.028103390708565712, 
0.007438741158694029, -0.040436916053295135, -0.0037460089661180973, -0.0131593132391572, 0.032285600900650024, 0.030554113909602165, 0.005933678243309259, -0.015090588480234146, 0.041422534734010696, -0.01100827194750309, -0.03617479279637337, 0.010488824918866158, -0.010701931081712246, 0.009243485517799854, -0.005277710501104593, -0.01450454629957676, -0.02233620174229145, 0.02344169095158577, 0.01539692934602499, 0.019046373665332794, -0.019206203520298004, 0.04653708636760712, -0.015490163117647171, -0.023175308480858803, 0.012573271058499813, 0.026398541405797005, 0.013179291971027851, 9.521106403553858e-05, 0.018100714311003685, 0.020857777446508408, -0.007119081914424896, 0.013279185630381107, -0.0021826745942234993, -0.0350826233625412, -0.0061501143500208855, -0.009136931970715523, -0.009862825274467468, 0.002357488265261054, -0.023561563342809677, 0.008231230080127716, 0.02282901108264923, 0.00845099613070488, 0.03207249566912651, -0.01539692934602499, -0.007352166809141636, 0.03236551582813263, 0.008677421137690544, -0.04581785202026367, -0.0017514672363176942, -0.026385221630334854, 0.027863647788763046, 0.008018123917281628, -0.016955269500613213, -0.0055873803794384, -0.004668359644711018, 0.01126133557409048, -0.004535167943686247, -0.026385221630334854, -0.0008724038489162922, 0.020697947591543198, -0.011780781671404839, 0.01530369557440281, -0.02426747791469097, -0.013505610637366772, 0.010715250857174397, 0.028662795200943947, 0.017354842275381088, -0.006056880112737417, -0.04019717499613762, 0.0062999543733894825, -0.017195014283061028, -0.003965775016695261, 0.0009897787822410464, -0.02342837303876877, 0.005976965185254812, 0.016822077333927155, -0.012699802406132221, -0.021030927076935768, 0.0061334650963544846, 0.031992580741643906, -9.115288412431255e-05, 0.007465379778295755, -0.011481101624667645, -0.0066828797571361065, 0.02358820289373398, -0.0002526475000195205, 0.03460313379764557, -0.005790497176349163, 0.01036229357123375, 
0.013139334507286549, -0.0052244337275624275, 0.011660909280180931, -0.019033055752515793, 0.011667569167912006, 0.024027733132243156, 0.02807675302028656, 0.021803436800837517, -0.011048229411244392, 0.002240945817902684, 0.024760287255048752, 0.004934742581099272, -0.004338710568845272, -0.0006101832259446383, -0.015023993328213692, -0.011694207787513733, 0.013119355775415897, -0.021590329706668854, 0.028263220563530922, 0.003759328043088317, 0.007625209167599678, 0.012107101269066334, 0.015849780291318893, -0.019659055396914482, -0.012972844764590263, -0.04509861767292023, -0.022908926010131836, 0.01881994865834713, 0.01108152698725462, -0.019073013216257095, -0.020458202809095383, -0.012686483561992645, 0.0038159345276653767, -0.0018863235600292683, -0.0006988387904129922, 0.00893048569560051, 0.02617211639881134, -0.019539183005690575, -0.014384673908352852, -0.5932878851890564, -0.011580994352698326, -0.01566331274807453, 0.0027354189660400152, 0.008977102115750313, 0.01149442046880722, 0.006489752326160669, 0.01800748147070408, -0.004032370634377003, -0.04208848997950554, -0.023028798401355743, 0.015050631016492844, 0.005440869834274054, 0.0030251103453338146, -0.01690199226140976, -0.007392124272882938, 0.009263464249670506, -0.011094845831394196, -0.002357488265261054, -0.01977892778813839, -0.008197932504117489, 0.015503481961786747, 0.004951391369104385, 0.007432081736624241, -0.013239228166639805, 0.020951012149453163, 0.012413441203534603, -0.018766671419143677, -0.004964710678905249, 0.034683048725128174, -0.031140156090259552, 0.016196077689528465, 0.013798631727695465, 0.00372935994528234, 0.05791163444519043, -0.005960316397249699, -0.011427824385464191, 0.01466437615454197, 0.027970200404524803, 0.0183271411806345, 0.011274654418230057, -0.00394579628482461, 0.012460058555006981, -0.007025847677141428, -0.0052310931496322155, -0.013918504118919373, 0.025785861536860466, -0.007798358332365751, 0.0028436370193958282, 0.02583913691341877, 
0.0004337045829743147, 0.0019362703897058964, -0.00045826175482943654, -0.010415569879114628, 0.023015478625893593, -0.0012678159400820732, 0.018580203875899315, -0.04086313024163246, -0.0015425232704728842, 0.0047849020920693874, 0.016795439645648003, 0.026744838804006577, -0.004138923715800047, -0.010189143940806389, -0.005763859022408724, 0.02454718016088009, -0.04099632054567337, -0.00870405975729227, 0.008690740913152695, -0.0016657252563163638, 0.0038692110683768988, 0.011647590436041355, -0.006389858666807413, 0.0014010072918608785, 0.002547285985201597, 0.022775733843445778, 0.02202986180782318, 0.007245613727718592, -0.007991485297679901, 0.01403837651014328, 0.01580982282757759, -0.003476296318694949, -0.023201946169137955, -0.01768782176077366, 0.010355633683502674, -0.003989083226770163, -0.011387866921722889, 0.0207379050552845, -0.0004757431452162564, -0.009436612948775291, -0.01507726963609457, 0.012506674975156784, -0.004158902447670698, 0.006250008009374142, 0.002142717130482197, 0.008644123561680317, -0.022229649126529694, -0.012100441381335258, 0.009316740557551384, -0.033883899450302124, 0.007305549923330545, -0.04517853260040283, -0.006925954483449459, 0.01164093054831028, -0.0032665198668837547, 0.024853520095348358, -0.014131610281765461, -0.010302357375621796, 0.018233906477689743, -0.0215237345546484, 0.0005473335040733218, 0.0028070092666894197, 0.020697947591543198, -0.0022043180651962757, 0.005677284672856331, -0.019366033375263214, 0.021150799468159676, 0.021643606945872307, 0.03929147124290466, 0.00853757094591856, 0.013132674619555473, 0.02967504970729351, 0.006872677709907293, -0.0004037365142721683, 0.038359131664037704, 0.009429953061044216, -0.02007194794714451, -0.005027976352721453, 0.014211525209248066, 0.00625666743144393, 0.0001508809218648821, -0.014904120936989784, 0.036680918186903, -0.017141737043857574, 0.03657436743378639, -0.017101779580116272, 0.01579650305211544, -0.004821529611945152, 0.02362816035747528, 
-0.009389995597302914, -0.007771719712764025, 0.016182757914066315, 0.02091105468571186, -0.004601764027029276, -0.009449931792914867, -0.017621226608753204, -0.010542101226747036, -0.014597781002521515, -0.036601003259420395, 0.0007879105396568775, 0.0012811350170522928, -0.007978166453540325, -0.013219249434769154, -0.010974973440170288, -0.01932607591152191, 0.041076235473155975, -0.013618824072182178, -0.0075852517038583755, -0.025439562276005745, 0.01736816205084324, 0.01325254701077938, 0.0026987912133336067, -0.01690199226140976, -0.004954721312969923, 0.004082317464053631, -0.027157733216881752, -0.010661973617970943, -0.005624007899314165, -0.03300483524799347, -0.03396381437778473, -0.010435548610985279, -0.014011737890541553, 0.0018430363852530718, 0.00505794445052743, -0.005011327564716339, -0.015383610501885414, -0.02697126381099224, -0.000325278437230736, -0.019419310614466667, -0.003979093860834837, 0.014211525209248066, 0.0023175308015197515, -0.003479626029729843, -0.0031166793778538704, 0.007725102826952934, -0.02057807520031929, 0.011181420646607876, 0.02759726345539093, 0.01992543786764145, 0.021124159917235374, 0.01269314344972372, 0.02280237339437008, -0.013705397956073284, 0.01612948253750801, -0.012706462293863297, -0.00011789522250182927, 0.030074624344706535, 0.013439015485346317, -0.008457656018435955, 0.03785300254821777, 0.003639455884695053, 0.02488015964627266, 0.0007954025641083717, -0.021590329706668854, 0.011967250145971775, -0.023881223052740097, 0.014850844629108906, -0.016848715022206306, 0.012420101091265678, -0.005164497531950474, -0.001505063148215413, -0.005430880468338728, -0.012346845120191574, -0.02011190541088581, 0.0077783796004951, 0.017168374732136726, -0.012619887478649616, 0.00326818460598588, 0.011953930370509624, -0.002958514727652073, -0.013998419046401978, -0.0045451573096215725, 0.010182484984397888, -0.0003080051683355123, -0.01852692849934101, 0.003238216508179903, 0.0018846587045118213, 
0.0013976775808259845, -0.012333526276051998, 0.0011654250556603074, -0.017994161695241928, -0.01493075955659151, 0.006386529188603163, 0.013299164362251759, -0.005633997265249491, 0.01579650305211544, 0.032605260610580444, -0.04030372574925423, 0.0295152198523283, -0.014397993683815002, 0.0036094877868890762, 0.02614547684788704, 0.005800486542284489, -0.02091105468571186, 0.007598571013659239, 0.023827945813536644, 0.007831656374037266, -0.006163433194160461, -0.0167554821819067, 0.024187562987208366, -0.012180356308817863, 0.014864163473248482, -0.018633481115102768, -0.012426760047674179, 0.013339121825993061, -0.046350616961717606, -0.0049214232712984085, 0.022922245785593987, 0.02677147649228573, 0.013359100557863712, 0.003148312447592616, -0.007312209345400333, -0.001261988771148026, 0.007538634818047285, 0.034070368856191635, 0.006126805674284697, -0.0043953172862529755, -0.020618032664060593, -0.02153705433011055, 0.006383199244737625, 0.002515653148293495, -0.020671309903264046, 0.007438741158694029, -0.01085510104894638, 0.017954204231500626, -0.008650783449411392, -0.015103908255696297, -0.009736292995512486, -0.008624144829809666, -0.007771719712764025, -0.030527476221323013, -0.013079398311674595, 0.007378804963082075, 0.007365486118942499, -0.00893714465200901, -0.0028602860402315855, -0.0199387576431036, -0.005534104071557522, -0.00011040320532629266, 0.030900411307811737, -0.014531184919178486, -0.0036061578430235386, 0.007025847677141428, 0.005171157419681549, -0.010568739846348763, 0.00614345446228981, 0.008351102471351624, -0.006686209701001644, -0.015130545943975449, 0.005763859022408724, -0.011607632972300053, -0.015876417979598045, -0.016555694863200188, 0.008231230080127716, 0.024360712617635727, 0.016022928059101105, -0.02007194794714451, 0.0009631405118852854, -0.0019928766414523125, -0.0016507413238286972, -0.03409700468182564, -0.01660897210240364, -0.014171567745506763, -0.011594314128160477, -0.00028844267944805324, 
0.01628931239247322, -0.012713122181594372, -0.02262922376394272, 0.041742194443941116, 0.00034400849835947156, -0.006789433304220438, -0.025253094732761383, -0.0335376001894474, 0.01283299457281828, 0.08353766053915024, -0.001618275884538889, -0.012972844764590263, 0.003762657754123211, 0.034523218870162964, 0.006632933393120766, -0.025745904073119164, -0.03718704730272293, 0.022456074133515358, 0.0007313041714951396, 0.007198996841907501, -0.006799422670155764, 0.014397993683815002, -0.004981359466910362, 0.004049019422382116, -0.01165425032377243, -0.006802752148360014, -0.014810887165367603, 0.0077783796004951, 0.0004247557953931391, -0.0016623955452814698, 0.010468846186995506, -0.00013964288518764079, 0.032631900161504745, -0.017821013927459717, 0.0022592595778405666, 0.02378798834979534, 0.015849780291318893, 0.03494942933320999, -0.017314886674284935, -0.014850844629108906, -0.017181694507598877, 0.006523050367832184, 0.004262125585228205, -0.013945142738521099, -0.006729497108608484, -0.00812467746436596, 0.01820726878941059, -0.004045689478516579, -0.011993887834250927, 0.02677147649228573, 0.044113002717494965, 0.026358583942055702, -0.04102296009659767, 0.016835397109389305, -0.005727231502532959, -0.0067494758404791355, 0.0032765092328190804, -0.017900928854942322, -0.007172358222305775, 0.035002708435058594, -0.01347897294908762, 0.011028250679373741, -0.020325012505054474, 0.009683016687631607, -0.006852698978036642, 0.005334316752851009, -0.00701918825507164, -0.0030833815690129995, 0.009456591680645943, 0.004468572326004505, -0.028529604896903038, 0.0035695303231477737, -0.010555421002209187, -0.018074076622724533, -0.02075122483074665, 0.014637738466262817, -0.0065896459855139256, -0.006609624717384577, -0.003095035906881094, 0.001224528648890555, 0.0011879010125994682, -0.028156667947769165, -0.03092704899609089, 0.016995226964354515, 0.014051695354282856, 0.0302877314388752, 0.006100167520344257, -0.01451786607503891, 0.008490953594446182, 
-0.00028240744723007083, -0.02300216071307659, 0.01491743978112936, -0.018580203875899315, -0.015277056954801083, 0.025146542116999626, 0.0167288426309824, -0.003922487609088421, -0.0006946765352040529, 0.001029736245982349, 0.020657990127801895, 0.006423156708478928, 0.014557823538780212, 0.005943667609244585, 0.019885480403900146, -0.019099650904536247, -0.012912909500300884, 0.01931275799870491, 0.00634657172486186, -0.016302630305290222, -0.0021543714683502913, -0.014477908611297607, -0.007578592281788588, -0.013019462116062641, 0.02409433014690876, -0.01786097139120102, 0.0023724723141640425, 0.0010230767074972391, -0.011414505541324615, 0.005157838109880686, 0.014544503763318062, -0.018447013571858406, 0.009842846542596817, -0.0019346055341884494, 0.002054477808997035, 0.021776799112558365, -0.005254401825368404, 0.035668663680553436, 0.0034463282208889723, -0.017674501985311508, 0.02422752045094967, -0.018220586702227592, 0.014904120936989784, 0.021350586786866188, -0.03537564352154732, 0.029089007526636124, -0.01036895252764225, 0.003919157665222883, -0.015050631016492844, -0.0009240155341103673, 0.018606843426823616, 0.02458713762462139, 0.008197932504117489, 0.00041310154483653605, -0.03684074804186821, -0.01362548302859068, 0.005011327564716339, -0.004335381090641022, -0.006353231146931648, 0.007705124095082283, -0.03207249566912651, -0.016236035153269768, -0.0016457466408610344, -0.012460058555006981, 0.04067666083574295, -0.02550615929067135, -0.015250418335199356, -0.01189399417489767, 0.02200322411954403, 0.00554076349362731, -0.006040231324732304, -0.004641721490770578, -0.0036827430594712496, 0.023494968190789223, 0.009037038311362267, -0.01818062923848629, -0.014424631372094154, -0.007838315330445766, 0.015996290370821953, -0.018873225897550583, 0.03223232552409172, -0.01977892778813839, 0.011933951638638973, 0.013212589547038078, -0.009889463894069195, 0.00219432869926095, -0.026052244007587433, -0.00630328431725502, -0.037080492824316025, 
0.022722458466887474, 0.008957123383879662, 0.023401733487844467, -0.00489811459556222, -0.01036229357123375, -0.012899589724838734, 0.04368678852915764, -0.010002676397562027, 0.014957397244870663, -0.029115647077560425, -0.04528508707880974, -0.008477634750306606, -0.013279185630381107, -0.00350959412753582, 0.00916357059031725, 0.012453398667275906, -0.019738970324397087, 0.014318078756332397, -0.011541036888957024, 0.019818885251879692, -0.04096968472003937, 0.03156637027859688, -0.03950457647442818, 0.03652108833193779, -0.01362548302859068, 0.006879337131977081, -0.006253337487578392, 0.0039025088772177696, -0.013865227811038494, -0.010195803828537464, 0.014810887165367603, 0.015183823183178902, -0.008570868521928787, -0.004581785295158625, -0.016688885167241096, 0.006686209701001644, -0.0007421259651891887, 0.00630328431725502, -0.008277847431600094, 0.012866292148828506, -0.014411312527954578, -0.009356698021292686, -0.014251482672989368, -0.0065396991558372974, -0.033883899450302124, 0.009609761647880077, -0.00012236963084433228, -0.016688885167241096, 0.02791692316532135, -0.005241082515567541, -0.011234696954488754, 0.0043154023587703705, 0.007292230613529682, 0.04765589162707329, -0.0008665767381899059, 0.018087396398186684, 0.03095368854701519, -0.030873773619532585, -0.014437951147556305, -0.02839641273021698, 0.025412924587726593, -0.005134529434144497, 0.02217637374997139, 0.032312240451574326, -0.011754143983125687, 0.003321461146697402, 0.02585245668888092, 0.00653636921197176, -0.030261091887950897, -0.04062338545918465, 0.0231353510171175, 0.027037860825657845, -0.004771582782268524, -0.00990278273820877, -0.01770114153623581, 0.0027304242830723524, 0.007771719712764025, -0.010582058690488338, -0.0053975824266672134, -0.052717167884111404, -0.006389858666807413, -0.018060756847262383, -0.0020661321468651295, -0.01022244244813919, 0.01786097139120102, 0.007099103182554245, -0.015516801737248898, -0.020671309903264046, -0.023894542828202248, 
-0.016222715377807617, 0.027037860825657845, 0.0029734985437244177, 0.05205120891332626, -0.002831982681527734, -0.009569804184138775, 0.013059419579803944, 0.010595378465950489, -0.010841782204806805, -0.023641478270292282, -0.028103390708565712, 0.05466176196932793, -0.0009573134011588991, 0.00041539076482877135, -0.026758158579468727, 0.028636157512664795, 0.017581269145011902, 0.010628676041960716, 0.01720833219587803, -0.005707252770662308, 0.002517318120226264, -0.0021510415244847536, 0.013785312883555889, 0.038865260779857635, -0.002052812837064266, -0.002873605117201805, -0.017794374376535416, -0.00957646407186985, -0.0023641479201614857, 0.0070591457188129425, 0.013538909144699574, -0.0006697031785733998, -0.0087706558406353, -0.003339775139465928, -0.014051695354282856, 0.013265866786241531, -0.01576986536383629, -0.01163427159190178, -0.02220301143825054, 0.03002134896814823, -0.04288763925433159, 0.009057017043232918, -0.0036660940386354923, -0.004205519333481789, -0.0159430131316185, -0.005670625250786543, 0.022935563698410988, -0.015210460871458054, 0.006509731058031321, -0.017514672130346298, -0.020031990483403206, -0.035801857709884644, 0.012586589902639389, 0.007944868877530098, 0.003982423804700375, -0.016502417623996735, 0.05324993282556534, -0.0020128553733229637, -0.0006543028866872191, -0.018593523651361465, -0.02677147649228573, 0.0036527749616652727, -0.00494140200316906, 0.0023058766964823008, -0.013665440492331982, -0.03063402883708477, 0.024800244718790054, -0.007465379778295755, -0.009716315194964409, 0.004335381090641022, -0.047149766236543655, -0.013512270525097847, -0.006250008009374142, 0.013998419046401978, -0.010622016154229641, -0.03159300610423088, -0.028263220563530922, -0.013745355419814587, -0.02314867079257965, 0.013598845340311527, -0.01253331359475851, 0.00821791123598814, 0.03148645535111427, 0.029595134779810905, -0.01644914224743843, -0.009516527876257896, 0.006356561090797186, 0.0039025088772177696, 
-0.0036827430594712496, -0.017301566898822784, -0.05940337851643562, -0.019086331129074097, -0.024027733132243156, 0.03465640917420387, -0.0015058956341817975, -0.003699391847476363, -0.02422752045094967, 0.010721909813582897, -0.01142116542905569, -0.019579140469431877, 0.006353231146931648, 0.005917029455304146, 0.015503481961786747, -0.02394781820476055, 0.015317014418542385, 0.0059037101455032825, -0.015343653038144112, 0.015903057530522346, -0.02297552116215229, -0.00021196166926529258, -0.015290375798940659, -0.01866011880338192, 0.008411038666963577, 0.011294633150100708, -0.041236065328121185, -0.013265866786241531, 0.002547285985201597, -0.009556485339999199, 0.0018330470193177462, 0.010428888723254204, -0.021936628967523575, -0.0010730234207585454, -0.006289965007454157, -0.007844975218176842, -0.010428888723254204, -0.010229101404547691, -0.001442629611119628, -0.0191262885928154, -0.02730424329638481, 0.0004973867326043546, 0.003506264416500926, -0.024826882407069206, -0.003839242970570922, 0.011381207965314388, -0.009916101582348347, 0.012773058377206326, 0.007498677354305983, -0.026598328724503517, 0.0016207732260227203, 0.006283305585384369, 0.020791182294487953, -0.012093781493604183, -0.02041824534535408, 0.0155567592009902, -0.013052759692072868, -0.02614547684788704, -0.03201922029256821, 9.323399717686698e-05, -0.026997903361916542, 0.016995226964354515, -0.0034629772417247295, -0.018740033730864525, 0.03937138617038727, -0.034842878580093384, 0.027863647788763046, -0.02252267114818096, -0.006556347943842411, -0.006999209523200989, -0.0035262431483715773, -0.026385221630334854, 0.03590840846300125, 0.042701173573732376, -0.004525178577750921, 0.006010263226926327, -0.012147058732807636, -0.015370290726423264, -0.0018064087489619851, -0.0287160724401474, -0.016009610146284103, -0.01752799190580845, -0.0037693174090236425, 0.004391987342387438, 0.01852692849934101, -0.020205140113830566, -0.028529604896903038, 0.018740033730864525, 
-0.02204318158328533, -0.03132662549614906, 0.3066599369049072, -0.004924753215163946, 0.007798358332365751, 0.017807694151997566, 0.02519981749355793, 0.016022928059101105, 0.026704881340265274, 0.007938208989799023, -0.006130135618150234, 0.01720833219587803, -0.041422534734010696, 0.00011425327102188021, -0.011840717867016792, 0.003399711335077882, 0.002635525306686759, -0.04038364067673683, -0.032285600900650024, -0.011714186519384384, -0.009236825630068779, -0.050959039479494095, 0.018074076622724533, -0.010422229766845703, -0.014371355064213276, -0.028183305636048317, 0.013838589191436768, 0.0017980842385441065, -0.00925680436193943, -0.019112970679998398, 0.022868968546390533, -0.008078060112893581, -0.013272525742650032, -0.003459647297859192, -0.0039957426488399506, -0.01276639848947525, -0.017967524006962776, 0.004728295840322971, 0.03273845463991165, 0.015703270211815834, 0.021030927076935768, 0.028023475781083107, 0.030181176960468292, -0.025239774957299232, 0.007159039378166199, 0.0033381101675331593, -0.009263464249670506, 0.026678243651986122, -0.013505610637366772, -0.002660498721525073, 0.0001414118305547163, 0.01897977851331234, -0.010728569701313972, -0.010721909813582897, 0.01756794936954975, 0.04912099987268448, 0.013905185274779797, 0.00015264986723195761, 0.014970717020332813, -0.00027429108740761876, -0.012320207431912422, 0.01204716507345438, -0.01389186643064022, 0.0478956364095211, -0.0013610499445348978, 0.017301566898822784, -0.01996539533138275, 0.006553018465638161, -0.030341006815433502, -0.015370290726423264, 0.0063732098788022995, -0.014211525209248066, -0.005467507988214493, -0.02996807172894478, -0.013219249434769154, 0.0033797326032072306, -0.022056501358747482, -0.012813015840947628, 0.021883351728320122, 0.003706051502376795, 0.037080492824316025, 0.017940886318683624, -0.008031442761421204, -0.002275908598676324, -0.011254675686359406, -0.011614292860031128, -0.012813015840947628, -0.032924920320510864, 0.02983487956225872, 
-0.01723497174680233, -0.03140654042363167, -0.010974973440170288, 0.006496411748230457, 0.018087396398186684, -0.005051285028457642, -0.01642250269651413, 0.020684629678726196, 0.02185671404004097, 0.01723497174680233, 0.030394284054636955, -0.023015478625893593, -0.0003858389100059867, -0.03255198523402214, -0.0143580362200737, 0.015290375798940659, 0.01022244244813919, -0.0175413116812706, 0.01077518705278635, 0.004774912726134062, -0.0011862361570820212, 0.0024607116356492043, -0.009636400267481804, -0.002936871023848653, -0.03002134896814823, -0.0007238121470436454, -0.00029447791166603565, 0.00948988925665617, -0.0034496579319238663, -0.004292093683034182, -0.0043953172862529755, 0.027357518672943115, -0.003403041046112776, 0.008024783805012703, -0.021417181938886642, -0.012040505185723305, 0.022935563698410988, -0.012093781493604183, -0.026092201471328735, -0.017927566543221474, 0.010928357020020485, 0.023561563342809677, -0.03977096080780029, 0.021883351728320122, -0.04272780939936638, 0.024453945457935333, 0.007785039022564888, -0.009636400267481804, 0.02839641273021698, 0.010328995063900948, -0.011740824207663536, -0.01020246371626854, -0.02378798834979534, 0.0024640413466840982, -0.00466502970084548, 0.0036561046727001667, -0.02040492743253708, 0.025719264522194862, -0.009030378423631191, 0.018313821405172348, -0.02218969166278839, 0.04515189304947853, -0.026105519384145737, -0.032924920320510864, 0.01325254701077938, -0.006406507920473814, -0.012320207431912422, 0.013039440847933292, -0.02518649958074093, -0.03433674946427345, -0.0075719328597188, 0.0070658051408827305, 0.029941434040665627, -0.03383062407374382, -0.008724038489162922, 0.03633462265133858, -0.009776250459253788, -0.006872677709907293, 0.0005144518800079823, -0.16803430020809174, 0.009123613126575947, 0.02697126381099224, -0.0199121180921793, 0.02502666972577572, -0.0018447012407705188, -0.004115615040063858, 0.003922487609088421, -0.009676357731223106, 0.006296624895185232, 
0.01977892778813839, 0.002302546752616763, 0.006170093081891537, -0.019738970324397087, -0.023161988705396652, 0.028050115332007408, -0.03415028378367424, -0.011501080356538296, 0.04504534229636192, 0.019419310614466667, -0.006259997375309467, -0.006689539644867182, 0.006925954483449459, -0.0022675839718431234, 0.016675567254424095, 0.023255223408341408, 0.014890802092850208, -0.0077517409808933735, -0.020657990127801895, -0.03223232552409172, -0.007838315330445766, -0.0026704880874603987, -0.0053842635825276375, -0.024627095088362694, 0.0447256825864315, 0.008504272438585758, 0.0029268816579133272, 0.016302630305290222, -0.02342837303876877, 0.009156910702586174, 0.03332449495792389, 0.016009610146284103, 0.007432081736624241, 0.0002559772692620754, -0.025133222341537476, 0.023375095799565315, 0.016382545232772827, 0.0051611680537462234, 0.0046184128150343895, 0.00013454415602609515, -0.006170093081891537, -0.0015799832763150334, 0.025972329080104828, 0.01373203657567501, 0.014890802092850208, 0.013545568101108074, -0.002442397875711322, 0.02298884093761444, -0.0018197279423475266, 0.028423050418496132, 0.004468572326004505, -0.024014415219426155, 0.005673954728990793, 0.008457656018435955, 0.010915037244558334, 0.00257392437197268, -0.0010979968355968595, -0.0011554356897249818, -0.022282926365733147, 0.011367888189852238, -0.010142527520656586, -0.03300483524799347, 0.03417691960930824, 0.008810613304376602, 0.0069659119471907616, 0.014397993683815002, -0.004691667854785919, -0.013139334507286549, 0.0017248289659619331, 0.01975228823721409, -0.012393462471663952, 0.01624935492873192, -0.03622806817293167, -0.014970717020332813, -0.002856956096366048, 0.010548761114478111, 0.018859906122088432, -0.016142800450325012, 0.012246951460838318, -0.0022908926475793123, 0.003909168299287558, -0.024360712617635727, -0.018540246412158012, -0.025239774957299232, -0.004345370456576347, 0.023641478270292282, 0.0017448076978325844, -0.004178881179541349, -0.006086848210543394, 
-0.022602586075663567, 0.017288247123360634, -0.004325391724705696, -0.0067861033603549, 0.022123096510767937, -0.003942466340959072, -0.0055274441838264465, -0.0004255882522556931, 0.024853520095348358, 0.03745343163609505, -0.008231230080127716, -0.010189143940806389, 0.007731762249022722, 0.027543988078832626, 0.01980556547641754, -0.01475760992616415, 0.011114824563264847, -0.0191529281437397, -0.008923825807869434, 0.025466201826930046, -0.006566337309777737, -0.017994161695241928, 0.023175308480858803, 0.003198259277269244, 0.009709655307233334, -0.008810613304376602, -0.006839379668235779, -0.08795961737632751, 0.0034829559735953808, 0.026904668658971786, 0.018233906477689743, -0.015903057530522346, -0.005890390835702419, -0.006053550634533167, 0.031672921031713486, -0.01267982367426157, 0.023028798401355743, -0.01566331274807453, -0.019898800179362297, 0.0012120420578867197, -0.007525315508246422, 0.006779443938285112, -0.006200061179697514, 0.035668663680553436, -0.017181694507598877, -0.008943804539740086, 0.028209945186972618, -0.005860422737896442, -0.006170093081891537, 0.00024723660317249596, -0.008590847253799438, -0.0019379352452233434, -0.006932613905519247, -0.021244032308459282, 0.020951012149453163, 0.009210187010467052, 0.010189143940806389, 0.022775733843445778, -0.002665493404492736, -0.01434471644461155, -0.014104972593486309, 0.010948335751891136, -0.013359100557863712, -0.015210460871458054, 0.009729634039103985, 0.03092704899609089, -0.03223232552409172, -0.011048229411244392, 0.011361229233443737, 0.021297309547662735, -0.02028505504131317, 0.01608952507376671, -0.0011296297889202833, -0.004608423449099064, 0.03417691960930824, 0.00609683757647872, -0.012786377221345901, -0.027837008237838745, -0.012759738601744175, -0.0013568876311182976, 0.0010080926585942507, 0.04845504090189934, -0.006259997375309467, 0.009176889434456825, 0.026917988434433937, -0.014890802092850208, 0.010875079780817032, -0.009370016865432262, -0.012506674975156784, 
-0.005917029455304146, -0.01929943822324276, 0.016475779935717583, 0.014491227455437183, 0.019858842715620995, 0.00032111621112562716, 0.013372419402003288, -0.01578318513929844, -0.021017607301473618, -0.0017448076978325844, -0.012253611348569393, 0.01736816205084324, -0.023161988705396652, -0.013265866786241531, -0.007272251881659031, 0.00940331444144249, 0.009370016865432262, -0.03428347408771515, -0.000847430492285639, -0.01720833219587803, 0.0069725713692605495, -0.03209913522005081, 0.02408101037144661, 0.012140398845076561, 0.02246939390897751, -0.002715440234169364, 0.008297826163470745, -0.02233620174229145, -0.026371903717517853, -0.010701931081712246, -0.003452987875789404, -0.02328186109662056, 0.014730972237884998, 0.009276783093810081, 0.00016742579464334995, -0.011767462827265263, 0.003542891936376691, 0.010868420824408531, -0.0013235898222774267, -0.03782636672258377, -0.06483758985996246, -0.0005236088181845844, 0.007318869233131409, -0.0037393493112176657, 0.004421955440193415, -0.013705397956073284, 0.0135722067207098, -0.008457656018435955, -0.021577011793851852, 0.012187016196548939, -0.02679811604321003, 0.0271444134414196, 0.005803816486150026, -0.010648654773831367, -0.01491743978112936, -0.004651710856705904, -0.0008915501530282199, 0.004045689478516579, 0.0025323019362986088, 0.024174245074391365, 0.008983762003481388, 0.014864163473248482, 0.0026421849615871906, -0.0014834195608273149, -0.02983487956225872, 0.01124135684221983, -0.03225896507501602, 0.03209913522005081, -0.01899309828877449, -0.024507222697138786, 0.0034463282208889723, -0.0223495215177536, 0.010275718756020069, 0.009483229368925095, -0.004025710746645927, -0.011048229411244392, 0.02187003195285797, 0.008763995952904224, 0.01724828965961933, 0.05551418662071228, -0.05010661482810974, -0.02139054425060749, 0.024294117465615273, 0.00426878547295928, -0.01690199226140976, 0.015090588480234146, -0.012087122537195683, 0.037266962230205536, 0.03921155631542206, 
-0.019059693440794945, 0.020631352439522743, 0.02551947720348835, -0.010522122494876385, -0.009669697843492031, 0.013618824072182178, -0.021257352083921432, 0.00701918825507164, 0.002300882013514638, -0.005950327031314373, -0.038225941359996796, 0.02855624258518219, 0.011228037066757679, 0.007105762604624033, -0.023881223052740097, 0.00813133642077446, -0.02618543431162834, -0.020937692373991013, -0.008823932148516178, 0.015117227099835873, -0.03446994349360466, -0.012060483917593956, -0.0063798693008720875, 0.019033055752515793, 0.0319393053650856, -0.009962718933820724, 0.014131610281765461, -0.00048490005428902805, -0.025173179805278778, -0.01708845980465412, 0.023042116314172745, -0.003646115306764841, 0.012686483561992645, -0.023015478625893593, 0.0004711646761279553, 0.019352715462446213, 0.02440067008137703, -0.0012003877200186253, 0.013705397956073284, 0.006569667253643274, 0.022575946524739265, 0.028636157512664795, 0.012013866566121578, -0.008497613482177258, 0.006223369389772415, -0.001107986201532185, 0.016502417623996735, -0.01165425032377243, 0.002802014583721757, 0.00662627350538969, 0.03175283595919609, 0.001671552425250411, 0.010235761292278767, -0.0048348489217460155, -0.016862034797668457, -0.00415224302560091, -0.007245613727718592, -0.025253094732761383, -0.033910539001226425, 0.004029040690511465, 0.015436886809766293, -0.02378798834979534, 0.005037965718656778, 0.022615903988480568, 0.03164628520607948, -0.003809274872764945, 0.010408909991383553, 0.007192336954176426, -0.013012802228331566, -0.026052244007587433, 0.04256797954440117, 0.015583396889269352, 0.010741888545453548, 0.019659055396914482, -0.015023993328213692, -0.0016032918356359005, 0.022282926365733147, 0.01627599261701107, 0.002553945640102029, 0.0021227383986115456, -0.022535989060997963, -0.0041489130817353725, -0.015610035508871078, -0.029568497091531754, -0.006499741692095995, 0.004731625318527222, -0.011214718222618103, 0.013299164362251759, 0.015756545588374138, 
-0.009789570234715939, 0.01403837651014328, 0.01627599261701107, -0.010668633505702019, 0.014238163828849792, 0.013811951503157616, 0.018566885963082314, 0.012613228522241116, 0.0030617378652095795, -0.004944731947034597, -0.003779306774958968, 0.018247226253151894, -0.02855624258518219, 0.039318110793828964, -0.025412924587726593, -0.051758188754320145, 0.0010222442215308547, -0.01770114153623581, 0.01419820636510849, -0.025119904428720474, 0.015356971882283688, 0.026358583942055702, -0.02060471475124359, 0.03337777033448219, -0.0050013381987810135, -0.03393717482686043, 0.001778105623088777, 0.01868675835430622, -0.010895058512687683, 2.286626295244787e-05, -0.02027173526585102, 0.011840717867016792, 0.023574883118271828, -0.012240292504429817, -0.0032748442608863115, 0.022376159206032753, -0.01643582247197628, 0.001834711991250515, 0.01980556547641754, 0.030554113909602165, -0.008970443159341812, 0.0013851908734068274, -0.006173422560095787, -0.019232843071222305, -0.019872160628437996, 0.028662795200943947, -0.0012578265741467476, -0.015863100066781044, 0.008164634928107262, -0.026105519384145737], "c1f7df04-5e6c-4327-a0cc-4a3489d50d19": [-0.006850742734968662, -0.014677336439490318, 0.005485869012773037, -0.016431232914328575, -0.009540928527712822, 0.019451098516583443, -0.016431232914328575, -0.019160980358719826, -0.03536802902817726, -0.019160980358719826, 0.03251959756016731, 0.016826847568154335, 0.009989292360842228, 0.003117120824754238, 0.01025303639471531, 0.01136075984686613, 0.028853559866547585, 0.009409056045114994, 0.041381385177373886, -0.012626729905605316, 0.012257488444447517, 0.01395204197615385, -0.014796021394431591, -0.00324569595977664, -0.020070895552635193, 0.009929950349032879, 0.03204485774040222, -0.013899292796850204, 0.00922443624585867, -0.004529797937721014, 0.01788182370364666, 0.010549748316407204, -0.01956978254020214, -0.008472766727209091, -0.022009411826729774, -0.02038738876581192, -0.009316746145486832, 
-0.010279410518705845, 0.01516526285558939, -0.010780523531138897, 0.010648651979863644, 0.00802440196275711, -0.0017505987780168653, 0.004740792792290449, -0.019451098516583443, 0.013226746581494808, -0.012395953759551048, -0.014967454597353935, -0.005555101670324802, 0.031359124928712845, 0.007661754265427589, 0.008327707648277283, -0.015468567609786987, -0.0025451267138123512, -0.005429823417216539, -0.008182648569345474, 0.01875217631459236, 0.04773760959506035, -0.024251233786344528, -0.018382936716079712, 0.010859646834433079, 0.004559469409286976, -0.011248668655753136, -0.0110376738011837, -0.01768401451408863, 0.024000676348805428, -0.0095475222915411, -0.034682296216487885, -0.0009964567143470049, 0.0009791484335437417, 0.0027561215683817863, 0.026097439229488373, -0.006534249987453222, 0.004753980319947004, 0.06361497938632965, -0.004803432151675224, -0.02658536471426487, -0.0058320327661931515, -0.008657386526465416, -0.0009090915555134416, -0.00871672946959734, -0.01558725256472826, -0.0339701883494854, 0.025701822713017464, 0.010081603191792965, 0.017116965726017952, 0.008598044514656067, 0.03906044363975525, -0.02438310533761978, -0.02145555056631565, 0.006165008991956711, 0.01340477354824543, 0.037161488085985184, 0.016721351072192192, 0.010912396013736725, 0.01949065923690796, 0.003969342447817326, 0.011901434510946274, -0.010279410518705845, -0.031781114637851715, -0.014044351875782013, -0.0017044437117874622, -0.012316830456256866, -0.014848770573735237, -0.0014654259430244565, -0.0162466112524271, 0.01540263183414936, -0.019609343260526657, 0.005743019282817841, 0.005202344618737698, 0.01802688278257847, -0.0066826059482991695, -0.01414984930306673, 0.005370480939745903, -0.015679562464356422, -0.015705937519669533, 0.030963510274887085, -0.008222210220992565, -0.017960945144295692, 0.020941250026226044, -0.0187258031219244]}, "text_id_to_doc_id": {"da552579-e0f4-4ee0-be68-a3c392e39dc2": "87411099-60d8-4272-a7d1-6e8676fc42a0", 
"c1f7df04-5e6c-4327-a0cc-4a3489d50d19": "87411099-60d8-4272-a7d1-6e8676fc42a0"}}}}

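The saved index.json is hard to read by eye because each text chunk is stored as a long list of floats. A minimal sketch for summarizing such a file is shown below. It deliberately does not assume any particular key names inside the file: it walks the parsed JSON and reports every entry whose value is a non-empty list of numbers. The helper names collect_vectors and summarize_index are made up for this illustration and are not part of llama_index.

```python
import json


def collect_vectors(node, found=None):
    """Recursively collect every {key: [numbers...]} entry in a parsed JSON tree."""
    if found is None:
        found = {}
    if isinstance(node, dict):
        for key, value in node.items():
            is_vector = (
                isinstance(value, list)
                and len(value) > 0
                and all(
                    isinstance(x, (int, float)) and not isinstance(x, bool)
                    for x in value
                )
            )
            if is_vector:
                found[key] = value
            else:
                # Keep descending into nested dicts and lists
                collect_vectors(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_vectors(item, found)
    return found


def summarize_index(path):
    """Return {key: vector_length} for every numeric vector found in the file."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    return {key: len(vec) for key, vec in collect_vectors(data).items()}
```

Calling summarize_index("./index.json") on the file built earlier should list one entry per stored text chunk together with the dimension of its embedding.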
Semantic queries over the vector index with llama_index

llama_index can match a query against the local vector index first and only then build the prompt:

class LLma:

    def query_index(self, prompt, index_path="./index.json"):

        # Load the index that was saved to disk earlier
        local_index = GPTSimpleVectorIndex.load_from_disk(index_path)
        # Query the index with the user's question
        res = local_index.query(prompt)

        print(res)

The GPTSimpleVectorIndex.load_from_disk method loads the saved vector index. Run the method:

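if __name__ == '__main__':

    llma = LLma()

    # Build the index (already done in the previous step, so it stays commented out)
    #llma.create_index()

    # Query the index: "Tell the story of the beautiful-woman snake"
    llma.query_index("讲一下美女蛇的故事")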

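The program returns (translated here from the Chinese output):

The story of the "beautiful-woman snake" goes back to a scholar who was staying in an old temple. One evening, while he was enjoying the cool air in the courtyard, he suddenly heard someone calling his name. He looked around and saw a beautiful woman's face appear over the top of the wall; she smiled at him and vanished. He was delighted, but the old monk who came by for an evening chat saw through what had happened, said there was an evil aura on his face, and told him he must have met the "beautiful-woman snake".

As we can see, this time the story is the "beautiful-woman snake" story we were actually asking about.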
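To make the retrieve-then-prompt idea concrete, here is a minimal sketch of what such a query does conceptually: score each stored vector against the query vector, keep the closest chunks, and paste them into the prompt as context. This is not llama_index's actual implementation. The function names, the toy two-dimensional vectors, and the prompt template are all assumptions made for illustration; in practice the vectors come from an embedding model and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, embeddings, k=1):
    """Return the ids of the k stored vectors most similar to query_vec."""
    scored = sorted(
        embeddings.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text_id for text_id, _ in scored[:k]]


def build_prompt(question, chunks):
    """Hypothetical prompt template: retrieved chunks first, then the question."""
    context = "\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

With toy data such as embeddings = {"snake": [1.0, 0.0], "garden": [0.0, 1.0]} and a query vector of [0.9, 0.1], top_k selects "snake", and only that chunk's text would be placed in the prompt sent to the model.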

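Customizing the llama_index model

By default llama_index generates answers with an OpenAI completion model (text-davinci-002 in the setup described here; the default may differ between releases). We can also supply a model configuration that suits our own needs: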

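from llama_index import SimpleDirectoryReader, GPTSimpleVectorIndex, LLMPredictor, ServiceContext
from langchain import OpenAI


class LLma:

    def __init__(self) -> None:

        # Custom LLM settings: deterministic output, text-davinci-003, up to 1800 completion tokens
        self.llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003", max_tokens=1800))
        self.service_context = ServiceContext.from_defaults(llm_predictor=self.llm_predictor)

    # Query the local index
    def query_index(self, prompt, index_path="./index.json"):

        # Load the index, passing the same service_context so the customized model is used for answers
        local_index = GPTSimpleVectorIndex.load_from_disk(index_path, service_context=self.service_context)
        # Query the index
        res = local_index.query(prompt)

        print(res)

    # Build the local index
    def create_index(self, dir_path="./data"):

        # Read the documents in the data folder
        documents = SimpleDirectoryReader(dir_path).load_data()

        index = GPTSimpleVectorIndex.from_documents(documents, service_context=self.service_context)

        print(documents)

        # Save the index
        index.save_to_disk('./index.json')


if __name__ == '__main__':

    llma = LLma()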

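Here the initializer sets up self.llm_predictor and wraps it in self.service_context. When the local vector index is built, the context is passed in through the service_context parameter: index = GPTSimpleVectorIndex.from_documents(documents,service_context=self.service_context).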
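One thing to keep in mind when choosing max_tokens: for OpenAI completion models the prompt and the completion share the same context window, so reserving 1800 tokens for the answer reduces the room left for the retrieved context and the question. The sketch below is plain arithmetic. The function name prompt_budget is made up for this illustration, and the 4096 figure is the limit quoted in this article's title, so treat it as an assumption that depends on the model you actually use.

```python
def prompt_budget(context_window=4096, max_output_tokens=1800):
    """Tokens left for the prompt (retrieved context + question).

    context_window: assumed total token limit of the model (4096 per the article).
    max_output_tokens: tokens reserved for the completion (max_tokens above).
    """
    if max_output_tokens >= context_window:
        raise ValueError("max_output_tokens must be smaller than context_window")
    return context_window - max_output_tokens
```

Under these assumptions prompt_budget() gives 2296, which is roughly the space available for the retrieved chunks and the question combined; a larger max_tokens leaves less room for context.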

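Conclusion

With this, we can use a corpus from a vertical domain to "customize" ChatGPT's answers. Finally, the project is available at github.com/zcxey2911/llama_index_examples_python3.10, shared here for anyone who wants to try it.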
