
In this video I explain clinical text classification using the Medical Transcriptions dataset from Kaggle. We start with exploratory data analysis and then move on to text classification. Along the way we look at the pitfalls in the data and how domain knowledge can improve the results.
If you like such content please subscribe to the channel here:
https://www.youtube.com/RSREETechNLPAIMLsimplified?sub_confirmation=1
Follow me:
Twitter: https://twitter.com/RsreeTech
Chapters/Timestamps:
00:00 Clinical Text Classification using Medical Transcriptions dataset from Kaggle
00:33 Exploratory analysis of Medical Transcriptions dataset
04:53 Data pre-processing
06:25 TF-IDF feature extraction
08:28 t-SNE plot of TF-IDF features
09:40 PCA for dimensionality reduction on features followed by multiclass Logistic Regression
10:50 Confusion Matrix and Classification Results
12:00 Domain knowledge to reduce classes
13:45 Using scispaCy to extract medical entities
14:30 TF-IDF feature extraction on reduced classes
14:58 PCA for dimensionality reduction on features followed by multiclass Logistic Regression
15:26 Confusion Matrix and Classification Results
16:28 SMOTE for imbalanced data
19:29 Key takeaways
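
For readers who want a feel for the core steps before watching, here is a minimal sketch of the TF-IDF, PCA and multiclass Logistic Regression chain named in the chapters, written with scikit-learn. The toy texts, the labels, the `to_dense` helper and all parameter values are assumptions made up for illustration. They are not taken from the video or the GitHub notebook, and the SMOTE and scispaCy steps are left out.

```python
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer

# Toy stand-ins for transcription text and medical specialty labels
# (invented for illustration, not rows from the Kaggle dataset).
texts = [
    "patient reports chest pain and shortness of breath",
    "echocardiogram shows reduced ejection fraction",
    "coronary artery disease with stable angina",
    "knee pain after fall, x-ray shows no fracture",
    "arthroscopy of the left knee with meniscus repair",
    "hip replacement surgery and postoperative physiotherapy",
]
labels = [
    "Cardiovascular", "Cardiovascular", "Cardiovascular",
    "Orthopedic", "Orthopedic", "Orthopedic",
]


def to_dense(matrix):
    """Convert the sparse TF-IDF matrix to a dense array before PCA."""
    return matrix.toarray()


# TF-IDF features -> dense array -> PCA -> multiclass Logistic Regression.
# n_components=2 is only suitable for this tiny example.
pipeline = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    FunctionTransformer(to_dense, accept_sparse=True),
    PCA(n_components=2, random_state=0),
    LogisticRegression(max_iter=1000),
)

pipeline.fit(texts, labels)
predictions = pipeline.predict(texts)
print(list(zip(labels, predictions)))
```

On the real dataset you would hold out a test split and inspect a confusion matrix, as the later chapters do, rather than predicting on the training texts.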
Link to Data:
https://www.kaggle.com/tboyle10/medicaltranscriptions
Github link:
https://github.com/rsreetech/ClinicalTextClassification/