DOM-based keyword extraction from web pages

Shah, Himat; Rezaei, Mohammad; Fränti, Pasi

dc.contributor.author	Shah, Himat
dc.contributor.author	Rezaei, Mohammad
dc.contributor.author	Fränti, Pasi
dc.contributor.editor	Tavares, João Manuel RS
dc.date.accessioned	2020-02-10T11:44:26Z
dc.date.available	2020-02-10T11:44:26Z
dc.date.issued	2019
dc.identifier.uri	https://erepo.uef.fi/handle/123456789/8021
dc.description.abstract	We present D-rank, an unsupervised, language- and domain-independent method for automatically extracting keywords from a single web page. The method does not use any corpus and relies only on the information and features available on the web page itself, including the page URL, word frequency, title, hyperlinks, and headers, which are extracted from the DOM tree of the page. Words are assigned different scores according to their importance, which is determined by their positions in the web page. Experimental results on web pages in three different languages show the effectiveness of the proposed method.
dc.language.iso	English
dc.publisher	ACM Press
dc.relation.ispartof	AIIPCC '19: Proceedings of the International Conference on Artificial Intelligence, Information Processing and Cloud Computing
dc.relation.uri	http://dx.doi.org/10.1145/3371425.3371495
dc.rights	In copyright 1.0
dc.title	DOM-based keyword extraction from web pages
dc.description.version	final draft
dc.contributor.department	School of Computing, activities
uef.solecris.id	67745320	en
dc.type.publication	Articles and abstracts in scientific conference publications
dc.relation.doi	10.1145/3371425.3371495
dc.description.reviewstatus	peerReviewed
dc.relation.articlenumber	62
dc.relation.isbn	978-1-4503-7633-4
dc.rights.accesslevel	openAccess
dc.type.okm	A4
uef.solecris.openaccess	No
dc.rights.copyright	© Association for Computing Machinery
dc.type.displayType	article	en
dc.type.displayType	artikkeli	fi
dc.rights.url	https://rightsstatements.org/page/InC/1.0/
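
The abstract describes scoring words by where they occur in the DOM (URL, title, headers, hyperlinks, body text). The following is a minimal sketch of that general idea only, not the authors' D-rank implementation. The weight values in WEIGHTS, the names PositionScorer and rank_keywords, the three-letter minimum word length, and the rule that URL words are boosted only when they also occur on the page are all assumptions made for illustration; the record does not state any of them.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical position weights; the record does not give D-rank's actual values.
WEIGHTS = {"title": 5.0, "header": 3.0, "link": 2.0, "url": 4.0, "text": 1.0}
HEADERS = {"h1", "h2", "h3", "h4", "h5", "h6"}
SKIP = {"script", "style"}
# Runs of 3+ Unicode letters, so no language-specific tokenizer is needed.
WORD = re.compile(r"[^\W\d_]{3,}")


class PositionScorer(HTMLParser):
    """Accumulates a score per word based on the enclosing DOM elements."""

    def __init__(self):
        super().__init__()
        self.stack = []          # currently open tags
        self.scores = Counter()  # word -> accumulated score

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed elements.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if any(t in SKIP for t in self.stack):
            return  # ignore script and style content
        if "title" in self.stack:
            position = "title"
        elif any(t in HEADERS for t in self.stack):
            position = "header"
        elif "a" in self.stack:
            position = "link"
        else:
            position = "text"
        # Each occurrence adds its position weight, so frequency also counts.
        for word in WORD.findall(data.lower()):
            self.scores[word] += WEIGHTS[position]


def rank_keywords(html, url, top_n=5):
    """Return the top_n highest-scoring words for one page."""
    scorer = PositionScorer()
    scorer.feed(html)
    scorer.close()
    parsed = urlparse(url)
    for word in WORD.findall((parsed.netloc + " " + parsed.path).lower()):
        # Assumption: boost URL words only if they also occur on the page.
        if word in scorer.scores:
            scorer.scores[word] += WEIGHTS["url"]
    return [word for word, _ in scorer.scores.most_common(top_n)]
```

Because the sketch uses no stop-word list or corpus, common function words still receive a small body-text score; in this illustration they are outranked only because they rarely appear in the title, headers, links, or URL.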

Files in this item

Name: 15813351321724089194.pdf
Size: 842.9Kb
Format: PDF
Description: Article

This item appears in the following Collection(s)

Luonnontieteiden, metsätieteiden ja tekniikan tiedekunta [1563]
Luonnontieteiden, metsätieteiden ja tekniikan tiedekunta / Faculty of Science, Forestry and Technology