Application of the Term Frequency-Inverse Document Frequency Weighting Scheme to the Pauline Corpus
Authors: | ; |
---|---|
Format: | Electronic Article |
Language: | English |
Published: | Andrews Univ. Press, 2022 |
In: | Andrews University Seminary studies, Year: 2022, Volume: 59, Issue: 2, Pages: 251-272 |
Online Access: | Full text (license required) |
Summary: | The term frequency-inverse document frequency (TF-IDF) weighting scheme is applied to the text of the thirteen epistles traditionally associated with the Apostle Paul. The data for the analysis are the morphologically tagged text of the Society of Biblical Literature’s Greek New Testament. TF-IDF weights are used to construct the document-term matrix (DTM) for a corpus under consideration, so that each document is represented by a multi-dimensional document vector. A query document is then chosen, its vector representation is constructed, and the cosine similarity between the query document and each document in the corpus is calculated. Three pairs of documents are consistently found to have the highest similarity: (1) Romans and Galatians, (2) Ephesians and Colossians, and (3) First Timothy and Titus. The results show that computational methods can be applied to the thirteen epistles and that their findings agree with those obtained from theological or literary analysis. |
---|---|
Item Description: | Publication years differ between the print and online editions (2021/2022) |
Contains: | Contained in: Andrews University. Seventh-Day Adventist Theological Seminary, Andrews University Seminary studies |
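
The pipeline described in the summary (TF-IDF weighting, a document-term matrix, a query vector, cosine similarity) can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: the toy corpus, the document names, the English placeholder tokens standing in for Greek lemmas, the helper names, and the smoothed IDF variant are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical toy corpus: short token lists standing in for lemmatized epistles.
corpus = {
    "doc_a": ["law", "faith", "grace", "faith", "works"],
    "doc_b": ["law", "faith", "works", "promise"],
    "doc_c": ["church", "body", "head", "mystery"],
}


def vectorize(tokens, vocab, idf):
    """Map a token list to a TF-IDF vector over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) or 1  # guard against an empty document
    return [counts[term] / total * idf[term] for term in vocab]


def build_dtm(docs):
    """Return (vocabulary, idf weights, {name: TF-IDF row}) for a dict of token lists."""
    vocab = sorted({t for tokens in docs.values() for t in tokens})
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for tokens in docs.values() for t in set(tokens))
    # Smoothed IDF (one common variant, assumed here): ln((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    dtm = {name: vectorize(tokens, vocab, idf) for name, tokens in docs.items()}
    return vocab, idf, dtm


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vocab, idf, dtm = build_dtm(corpus)

# The query document is projected onto the corpus vocabulary with the corpus IDF.
query = vectorize(["faith", "law", "grace"], vocab, idf)
scores = {name: cosine(query, row) for name, row in dtm.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy setup the query shares the most heavily weighted terms with `doc_a`, so `doc_a` ranks first, while `doc_c` shares no vocabulary with the query and scores zero. Ranking real epistles would additionally depend on choices the summary does not specify, such as lemmatization, stop-word handling, and the exact TF and IDF formulas.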