How to fix Solr 8.6.3 being unable to index HTML files
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width,initial-scale=1.0">
<link rel="stylesheet" href="bootstrap.min.css">
<title>Document</title>
<style>
div {
border: darkgreen 1px solid;
}
.li-center {
position: absolute;
left: 50%;
transform: translatex(-50%);
}
nav .navbar-nav li a:hover {
color: pink !important;
}
nav .navbar-nav li a {
color: lime !important;
}
</style>
</head>
<body>
<div class="container">
<div class="container" style="margin:0px;">
<nav class="navbar navbar-expand-md navbar-dark" style="background-color:darkblue;margin:0px;">
<div style="margin:0px;">
<a class="navbar-brand" href="#">
<img src="img/bug.png" height="30" alt="bug logo">
</a>
<span class="mr-auto" style="font-weight: bold;font-size:large;color:white">Instruction</span>
</div>
<button class="navbar-toggler" data-toggle="collapse" data-target="#myMenu">
<span class="navbar-toggler-icon" style="color:lime"></span>
</button>
<div class="collapse navbar-collapse" id="myMenu">
<ul class="navbar-nav ml-auto">
<li class="nav-item">
<a class="nav-link" href="#">Products</a>
</li>
<li class="nav-item active">
<a class="nav-link" href="#">Lessons</a>
</li>
<li class="nav-item">
<a class="nav-link" href="#">Help</a>
</li>
</ul>
</div>
</nav>
</div>
<div class="row">
<div class="col-sm-12 col-md-6 col-lg-3">
Lorem ipsum dolor sit amet consectetur adipisicing elit. Incidunt,dolor.
</div>
<div class="col-sm-12 col-md-6 col-lg-9">
Lorem ipsum dolor sit amet consectetur adipisicing elit. Inventore quo dolor odit magnam explicabo
delectus enim reiciendis quis cum? Ab,officia accusantium totam vel mollitia nemo repellat laudantium
animi,illum modi consequatur,ipsa aperiam blanditiis maiores fuga alias dicta enim dolores deserunt
quidem suscipit fugit! Praesentium voluptatibus distinctio,laudantium adipisci quisquam perspiciatis
consequuntur tempore,dignissimos error blanditiis aliquid nesciunt repellendus? Rerum ea molestias at
repellendus illum veritatis consectetur quas possimus aperiam,itaque explicabo ducimus harum ullam unde
placeat sunt nam laborum minima accusantium,provident hic non sed. Magnam doloribus aliquam,odit
adipisci consequatur eligendi perspiciatis perferendis,quisquam voluptatum eaque ducimus?
</div>
</div>
</div>
</body>
</html>
I have the following folder structure, and my Solr version is 8.6.3. When I enter the command:
solr/
├── bin/
├── CHANGES.TXT
├── contrib/
├── dist/
├── docs/
├── example/
├── licenses
............
├── server/
└── tempfolder/
    └── index.html
I get the following error:
Solr returned an error #404 (Not Found) for url: http://localhost:8983/solr/solrhelp/update/extract?resource.name=/home/user/solr-8.6.3/example/my-examples/index.html&literal.id=/home/user/solr-8.6.3/example/my-examples/index.html
However, this command works fine in solr-8.3.1. Does solr-8.6.3 support indexing HTML files? If so, how do I do it?
Solution
You have to enable the ExtractingRequestHandler and configure it so that it is available under /update/extract. This may already have been done in your old installation.
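One way to confirm whether the handler is actually registered is to ask Solr's Config API for the core's request handlers. The following is a minimal Python sketch, not a definitive check. It assumes the core is named solrhelp (taken from the error URL), that Solr listens on http://localhost:8983/solr, and that the JSON response nests handlers under config and then requestHandler; the function names are hypothetical.

```python
import json
import urllib.request


def handler_registered(config_response, name="/update/extract"):
    """Return True if `name` is listed among the request handlers.

    Assumes the response shape {"config": {"requestHandler": {...}}}.
    """
    handlers = config_response.get("config", {}).get("requestHandler", {})
    return name in handlers


def fetch_handlers(base_url, core):
    """Fetch the request handler section of a core's configuration as a dict."""
    url = f"{base_url}/{core}/config/requestHandler"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```

With a running server, `handler_registered(fetch_handlers("http://localhost:8983/solr", "solrhelp"))` returning False would be consistent with the 404 above.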
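If you are not using the example configsets, the jars required to use Solr Cell are not loaded automatically. You will need to configure your solrconfig.xml to find the ExtractingRequestHandler and its dependencies: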
<lib dir="${solr.install.dir:../../..}/contrib/extraction/lib" regex=".*\.jar" />
<lib dir="${solr.install.dir:../../..}/dist/" regex="solr-cell-\d.*\.jar" />
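Before restarting Solr, it can help to verify that those two directories really contain jars matching the patterns. The sketch below is only an illustration: it assumes the install directory layout shown in the question, assumes the regex is matched against the whole file name, and uses hypothetical function names.

```python
import re
from pathlib import Path
from typing import Dict, List


def matching_jars(directory: Path, pattern: str) -> List[str]:
    """List file names in `directory` whose whole name matches `pattern`."""
    if not directory.is_dir():
        return []
    regex = re.compile(pattern)
    return sorted(
        p.name for p in directory.iterdir()
        if p.is_file() and regex.fullmatch(p.name)
    )


def check_solr_cell_jars(install_dir: Path) -> Dict[str, List[str]]:
    """Apply the two <lib> patterns above to a Solr install directory."""
    return {
        "contrib/extraction/lib": matching_jars(
            install_dir / "contrib" / "extraction" / "lib", r".*\.jar"
        ),
        "dist": matching_jars(install_dir / "dist", r"solr-cell-\d.*\.jar"),
    }
```

An empty list for either key suggests the corresponding lib directive would load nothing, for example `check_solr_cell_jars(Path("/home/user/solr-8.6.3"))`.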
You can then configure the ExtractingRequestHandler in solrconfig.xml. The following is the default configuration found in Solr's _default configset, which you can modify as needed:
<requestHandler name="/update/extract" startup="lazy" class="solr.extraction.ExtractingRequestHandler" >
  <lst name="defaults">
    <str name="lowernames">true</str>
    <str name="fmap.content">_text_</str>
  </lst>
</requestHandler>
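Once the handler is in place, the HTML file can be sent to it directly over HTTP. The following Python sketch shows one possible way to do that with the standard library. It reuses the resource.name and literal.id parameters visible in the error URL; the core name solrhelp, the base URL, the commit=true parameter, and the function names are assumptions for illustration.

```python
import urllib.parse
import urllib.request


def build_extract_url(base_url, core, doc_id, resource_name, commit=True):
    """Build the /update/extract URL with URL-encoded query parameters."""
    params = {"literal.id": doc_id, "resource.name": resource_name}
    if commit:
        params["commit"] = "true"  # make the document searchable right away
    return f"{base_url}/{core}/update/extract?{urllib.parse.urlencode(params)}"


def post_html(path, base_url="http://localhost:8983/solr", core="solrhelp"):
    """POST the raw bytes of an HTML file to the extracting handler."""
    with open(path, "rb") as fh:
        body = fh.read()
    url = build_extract_url(base_url, core, doc_id=path, resource_name=path)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "text/html"}, method="POST"
    )
    # urlopen raises urllib.error.HTTPError on a 404 or other error status
    with urllib.request.urlopen(request) as resp:
        return resp.status
```

For example, `post_html("tempfolder/index.html")` run from the Solr directory would raise an HTTPError with code 404 if the handler is still missing.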