ansys.tools.meilisearch.create_indexes#
Create an index for each public, Sphinx-generated GitHub page of each repository in one or more organizations.
Functions#
| get_public_urls | Get all public GitHub pages (gh_pages) for each repository in one or more organizations. |
| get_sphinx_urls | Get URLs for pages that were generated using Sphinx. |
| create_sphinx_indexes | Create an index for each public GitHub page that was generated using Sphinx. |
| scrap_web_page | Scrape a web page and index its content in Meilisearch. |
Module Contents#
- ansys.tools.meilisearch.create_indexes.get_public_urls(orgs)#
Get all public GitHub pages (gh_pages) for each repository in one or more organizations.
- ansys.tools.meilisearch.create_indexes.get_sphinx_urls(urls)#
Get URLs for pages that were generated using Sphinx.
- ansys.tools.meilisearch.create_indexes.create_sphinx_indexes(sphinx_urls, stop_urls=None, meilisearch_host_url=None, meilisearch_api_key=None)#
Create an index for each public GitHub page that was generated using Sphinx.
The unique name created for the index (index_uid) is the repository name followed by -sphinx-docs, with a '-' instead of the '/' in the repository name. For example, the index created for the pyansys/pymapdl repository has pyansys-pymapdl-sphinx-docs as its unique name. The unique name for an index is always lowercase.
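The naming rule above can be sketched in a few lines of Python. The helper name sphinx_index_uid is hypothetical and is not part of the package; it only illustrates the documented rule and may differ from the actual implementation.

```python
def sphinx_index_uid(repo_name: str) -> str:
    """Build an index name from a repository name such as 'pyansys/pymapdl'.

    Hypothetical helper that mirrors the documented naming rule; it is not
    the package's own implementation.
    """
    # Replace '/' with '-', append the suffix, and lowercase the result.
    return f"{repo_name.replace('/', '-')}-sphinx-docs".lower()


print(sphinx_index_uid("pyansys/pymapdl"))  # pyansys-pymapdl-sphinx-docs
```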
- Parameters:
- sphinx_urls : dict
Dictionary where keys are the names of repositories that use Sphinx and values are their URLs.
- stop_urls : str or list[str], default: None
One or more stop points to use when scraping URLs. If specified, crawling stops at any URL that contains one of these strings.
- meilisearch_host_url : str, default: None
URL for the Meilisearch host.
- meilisearch_api_key : str, default: None
API key (admin) for the Meilisearch host.
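The stop_urls rule described above (a single string or a list of strings, matched as substrings of the URL) might look like the following. This is only an illustration of the documented behavior: the function name should_stop is hypothetical, and the crawler used by the package may match URLs differently.

```python
from typing import List, Optional, Union


def should_stop(url: str, stop_urls: Optional[Union[str, List[str]]] = None) -> bool:
    """Return True if ``url`` contains any of the stop strings.

    Hypothetical helper, not part of the package.
    """
    if stop_urls is None:
        return False
    # Accept a single string as well as a list of strings.
    if isinstance(stop_urls, str):
        stop_urls = [stop_urls]
    return any(stop in url for stop in stop_urls)
```

With stop_urls=["/release/"], a URL such as https://example.com/release/0.1/ would be treated as a stop point, while https://example.com/index.html would not.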
Notes
This function requires the GH_PUBLIC_TOKEN environment variable to be set to a GitHub token with public access.
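The functions documented so far appear designed to be chained: collect the public pages, keep the Sphinx-generated ones, then index them. The sketch below uses only the signatures shown on this page. The wrapper name index_org_docs is hypothetical, and it is an assumption that the return value of get_public_urls can be passed directly to get_sphinx_urls and that get_sphinx_urls returns the dictionary expected by create_sphinx_indexes.

```python
import os


def index_org_docs(orgs, host_url, api_key, stop_urls=None):
    """Index the Sphinx-generated GitHub pages of one or more organizations.

    Hypothetical wrapper around the documented functions.
    """
    # The Notes above say a GitHub token must be available in GH_PUBLIC_TOKEN.
    if not os.environ.get("GH_PUBLIC_TOKEN"):
        raise RuntimeError("Set the GH_PUBLIC_TOKEN environment variable first.")

    # Imported here so this sketch can be defined without the package installed.
    from ansys.tools.meilisearch.create_indexes import (
        create_sphinx_indexes,
        get_public_urls,
        get_sphinx_urls,
    )

    # Assumption: the output of each step is a valid input for the next one.
    public_urls = get_public_urls(orgs)
    sphinx_urls = get_sphinx_urls(public_urls)
    create_sphinx_indexes(
        sphinx_urls,
        stop_urls=stop_urls,
        meilisearch_host_url=host_url,
        meilisearch_api_key=api_key,
    )
    return sphinx_urls
```

A call might look like index_org_docs(["pyansys"], "http://localhost:7700", "my-admin-key"), where the organization, host URL, and key are placeholder values.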
- ansys.tools.meilisearch.create_indexes.scrap_web_page(index_uid, url, templates, stop_urls=None, meilisearch_host_url=None, meilisearch_api_key=None)#
Scrape a web page and index its content in Meilisearch.
- Parameters:
- index_uid : str
Unique name to give to the Meilisearch index.
- url : str
URL of the web page to scrape.
- templates : str or list[str]
One or more templates that determine which content is scraped. Available templates are sphinx_pydata and default.
- stop_urls : str or list[str], default: None
One or more stop points to use when scraping URLs. If specified, crawling stops at any URL that contains one of these strings.
- meilisearch_host_url : str, default: None
URL for the Meilisearch host.
- meilisearch_api_key : str, default: None
API key (admin) for the Meilisearch host.
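As a rough usage sketch, a single documentation site could be indexed by calling scrap_web_page with the parameters listed above. The wrapper name index_single_site is hypothetical, and the choice of the sphinx_pydata template is an assumption about the target site; the call itself needs a reachable Meilisearch host and a valid admin key.

```python
def index_single_site(index_uid, url, host_url, api_key, stop_urls=None):
    """Scrape one documentation site into a Meilisearch index.

    Hypothetical wrapper around the documented scrap_web_page function.
    """
    # Deferred import so the sketch can be defined without the package installed.
    from ansys.tools.meilisearch.create_indexes import scrap_web_page

    scrap_web_page(
        index_uid,
        url,
        templates="sphinx_pydata",  # one of the two templates listed above
        stop_urls=stop_urls,
        meilisearch_host_url=host_url,
        meilisearch_api_key=api_key,
    )
```

For example, index_single_site("my-docs", "https://example.com/docs/", "http://localhost:7700", "my-admin-key") would use placeholder values for every argument.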