Google Books Ngrams - Dataset Marketplace

Resource Introduction

N-grams are fixed-size tuples of items. In this case the items are words extracted from the Google Books corpus. The n specifies the number of elements in the tuple, so a 5-gram contains five words or characters. The n-grams in this dataset were produced by passing a sliding window over the text of books and outputting a record for each new token.
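
The sliding-window idea described above can be illustrated with a minimal Python sketch. This is not the pipeline used to build the dataset. The function name `sliding_ngrams`, the whitespace tokenization, and the sample sentence are all assumptions made for illustration. The sketch emits one n-gram for each new token once the window holds n tokens.

```python
from collections import Counter, deque
from typing import Iterable, Iterator, Tuple


def sliding_ngrams(tokens: Iterable[str], n: int) -> Iterator[Tuple[str, ...]]:
    """Yield every run of n consecutive tokens (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A deque with maxlen=n drops the oldest token automatically
    # whenever a new one is appended to a full window.
    window = deque(maxlen=n)
    for token in tokens:
        window.append(token)
        # Emit one record per new token once the window is full.
        if len(window) == n:
            yield tuple(window)


# Assumption: plain whitespace splitting stands in for real tokenization.
text = "the quick brown fox jumps over the lazy dog"
tokens = text.split()

five_grams = list(sliding_ngrams(tokens, 5))
counts = Counter(five_grams)  # occurrence count per distinct 5-gram

for gram in five_grams:
    print(" ".join(gram), counts[gram])
```

A text with fewer than n tokens yields no n-grams at all, which is why very short passages contribute nothing to the higher-order counts in a sketch like this one.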

