Please use this identifier to cite or link to this item: http://172.22.28.37:8080/xmlui/handle/1/412
Full metadata record
DC Field | Value | Language
dc.contributor.author | Kadam, Narendradatta Shankarrao | -
dc.date.accessioned | 2018-10-30T07:06:48Z | -
dc.date.available | 2018-10-30T07:06:48Z | -
dc.date.issued | 2013 | -
dc.identifier.uri | http://localhost:8080/xmlui/handle/1/412 | -
dc.description | Under the Guidance of Prof. S. S. Patil | en_US
dc.description.abstract | The World Wide Web is a vast and highly useful collection of information. To achieve high productivity in publishing, web pages are automatically populated using common templates with contents. Templates are considered harmful because they compromise the relevance judgment of many web information retrieval and web mining methods, such as clustering and classification, and badly impact the performance and resources of tools that process web pages. Thus, template detection techniques have received a lot of attention as a way to improve the performance of search engines and of the clustering and classification of web documents. This project presents an approach to detect and extract templates from heterogeneous web documents and cluster them into different groups. The pages belonging to each group should possess the same structure. This saves the time needed to find the best templates from a large number of web documents and also saves the memory required to find the best template structure. | en_US
dc.language.iso | en | en_US
dc.publisher | Rajarambapu Institute of Technology, Rajaramnagar | en_US
dc.subject | MinHash | en_US
dc.subject | (MDL) | en_US
dc.subject | Minimum Description Length | en_US
dc.subject | parsing | en_US
dc.title | A Technique for Automatic Template Extraction From Heterogeneous web pages | en_US
dc.type | Thesis | en_US
Appears in Collections:M.Tech Computer Science & Engineering

Files in This Item:
File | Description | Size | Format
A Technique for Automatic Template Extraction From Heterogeneous web pages.pdf (Restricted Access) | | 2.84 MB | Adobe PDF

