2018 International Conference on Artificial Intelligence and Data Processing (IDAP), Malatya, Turkey, 28-30 September 2018, pp. 554-557
Document classification is among the most widely studied topics today. Text similarity is applied in many areas, such as detecting whether quoted passages appear elsewhere and returning fast, accurate results in search engines. A variety of methods are used to measure similarity between documents. When several documents are compared, similarity is measured by two basic approaches: word-based and sentence-based. Word-based similarity measurement relies on measures such as Jaccard, Dice, and cosine similarity. In this study, the paragraphs of different documents are split into sentences and represented as graphs, and the documents are classified using the Hamming distance obtained by applying XOR to the neighborhood (adjacency) matrices derived from these documents.
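The comparison step described above can be illustrated with a minimal Python sketch. The abstract does not say how the sentence graph is built, so the construction here is an assumption made only for illustration: nodes are the words of a vocabulary shared by both documents, and an undirected edge joins two words that occur consecutively within a sentence. The function names (`split_sentences`, `tokenize`, `shared_vocab`, `adjacency_matrix`, `hamming_distance`) are hypothetical and do not come from the paper.

```python
import re


def split_sentences(text):
    """Split text into sentences on '.', '!' and '?' (a deliberately naive rule)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase a sentence and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def shared_vocab(*texts):
    """Sorted vocabulary over all texts, so every matrix has the same shape."""
    return sorted({w for t in texts for s in split_sentences(t) for w in tokenize(s)})


def adjacency_matrix(text, vocab):
    """Binary adjacency matrix over vocab.

    Assumption (not from the paper): two words are neighbors when they
    appear consecutively inside the same sentence; edges are undirected.
    """
    index = {w: i for i, w in enumerate(vocab)}
    n = len(vocab)
    matrix = [[0] * n for _ in range(n)]
    for sentence in split_sentences(text):
        words = tokenize(sentence)
        for a, b in zip(words, words[1:]):
            i, j = index[a], index[b]
            matrix[i][j] = matrix[j][i] = 1
    return matrix


def hamming_distance(m1, m2):
    """Count the cells where two equally sized binary matrices differ (XOR = 1)."""
    return sum(x ^ y for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))


doc_a = "The cat sat. The dog ran."
doc_b = "The cat ran. The dog sat."
vocab = shared_vocab(doc_a, doc_b)
distance = hamming_distance(adjacency_matrix(doc_a, vocab),
                            adjacency_matrix(doc_b, vocab))
print(distance)  # 8: four undirected edges differ, each counted in two cells
```

A smaller distance means the two sentence graphs share more edges. Under the same assumptions, a document could be assigned the class of the labeled reference document whose adjacency matrix lies at the smallest Hamming distance, although the abstract does not state which classification rule the study uses.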