
  • Article type: Journal Article
    BACKGROUND: Growing reliance on electronic health records (EHRs) and their data types (ie, diagnosis, medication, and laboratory data) makes assessing EHR data quality essential, particularly because denominator populations with chronic conditions, such as type 2 diabetes (T2D), must be identified using commonly available computable phenotype definitions (ie, phenotypes).
    OBJECTIVE: Our study aims to assess how EHR data quality issues, together with the variation and robustness (or lack thereof) of phenotypes, may affect the identification of denominator populations.
    METHODS: Approximately 208,000 patients with T2D were included in our study, which used retrospective EHR data from the Johns Hopkins Medical Institution (JHMI) during 2017-2019. Our assessment included 4 published phenotypes and 1 definition from a panel of experts at Hopkins. We conducted descriptive analyses of demographics (ie, age, sex, race, and ethnicity), use of health care (inpatient and emergency room visits), and the average Charlson Comorbidity Index score of each phenotype. We then used different methods to induce or simulate data quality issues of completeness, accuracy, and timeliness separately across each phenotype. For induced data incompleteness, our model randomly dropped diagnosis, medication, and laboratory codes independently at increments of 10%; for induced data inaccuracy, our model randomly replaced a diagnosis or medication code with another code of the same data type and induced 2% incremental change from -100% to +10% in laboratory result values; and lastly, for timeliness, our model shifted date records incrementally by 30 days to 365 days.
    RESULTS: Less than a quarter (n=47,326, 23%) of the population overlapped across all phenotypes using EHRs. The population identified by each phenotype varied across all combinations of data types. Induced incompleteness identified fewer patients with each increment; for example, at 100% diagnostic incompleteness, the Chronic Conditions Data Warehouse phenotype identified zero patients, as its phenotypic characteristics included only diagnosis codes. Induced inaccuracy and induced timeliness shifts similarly produced variation in the performance of each phenotype, with fewer patients identified at each incremental change.
    CONCLUSIONS: We used EHR data with diagnosis, medication, and laboratory data types from a large tertiary hospital system to understand T2D phenotypic differences and performance. We used induced data quality methods to learn how data quality issues may impact identification of the denominator populations upon which clinical (eg, clinical research and trials, population health evaluations) and financial or operational decisions are made. The novel results from our study may inform future approaches to shaping a common T2D computable phenotype definition that can be applied to clinical informatics, managing chronic conditions, and additional industry-wide efforts in health care.
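    The induced-incompleteness step described in the methods can be illustrated with a minimal sketch. It is not the authors' model: the toy records, the `drop_codes` helper, and the diagnosis-only phenotype are assumptions made for illustration, with the ICD-10 category E11 standing in for a T2D diagnosis.

```python
import random

def drop_codes(records, fraction, seed=0):
    """Return a random subset of `records` with `fraction` of the entries removed."""
    rng = random.Random(seed)
    n_keep = len(records) - round(len(records) * fraction)
    return rng.sample(records, n_keep)

def diagnosis_only_phenotype(records):
    """Hypothetical phenotype: any patient with at least one T2D diagnosis code."""
    return {patient for patient, code in records if code == "E11"}

# Toy data: (patient_id, diagnosis_code) pairs. "E11" is the ICD-10 category for
# type 2 diabetes; "I10" (essential hypertension) acts as a non-qualifying code.
records = [(pid, "E11") for pid in range(100)] + [(pid, "I10") for pid in range(100, 150)]

# Patients identified at each level of induced diagnostic incompleteness.
identified = {
    pct: len(diagnosis_only_phenotype(drop_codes(records, pct / 100)))
    for pct in range(0, 101, 10)
}
for pct, count in identified.items():
    print(f"{pct:3d}% dropped -> {count} patients identified")
```

    In this sketch, a phenotype that relies only on diagnosis codes identifies no patients once all diagnosis codes are dropped, which mirrors the behavior reported for the Chronic Conditions Data Warehouse phenotype.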






  • Article type: Journal Article
    BACKGROUND: Neck reflex points or Adler-Langer points are commonly used in neural therapy to detect so-called interference fields. Chronic irritations or inflammations in the sinuses, teeth, tonsils, or ears are supposed to induce tension and tenderness of the soft tissues and short muscles in the upper cervical spine. The individual treatment strategy is based on the results of diagnostic Adler-Langer point palpation. This study investigated the inter- and intra-rater reliability and explored treatment effects.
    METHODS: We performed a randomized controlled trial with 104 inpatients (80.8% female, 51.8 ± 12.74 years) of a German department for internal and integrative medicine. Patients were randomized to individual neural therapy according to the pathological findings (n = 48) or no treatment (n = 56). In each patient, three experienced raters (20-45 years of experience in neural therapy) and two novice raters (medical students) rated Adler-Langer point rigidity on a standardized rating scale ("strong," "weak," "none"). The patients independently evaluated the tenderness on palpation of the eight points using the same scale. Pressure pain thresholds were assessed at the eight Adler-Langer points. All patients were retested after 30 min. The five raters were blinded to treatment allocation and assessments of the other raters. Video recordings were obtained to assess the consistency of the areas tested by the different raters.
    RESULTS: Agreement between patients and raters was low (Cohen's kappa = 0.161-0.400), as was inter-rater reliability (Fleiss kappa = 0.132-0.150). Moreover, intra-rater agreement (pre-post comparisons in untreated patients) was similarly low even in experienced raters (Cohen's kappa = 0.099-0.173). Video documentation suggested that raters did not place their fingers in the correct segments (percentage of correct position: 42.0-60.6%). Pressure pain thresholds at five of the eight Adler-Langer points showed significant changes after treatment compared to none in the control group.
    CONCLUSIONS: Under this artificial experimental setting, Adler-Langer point palpation did not prove to be a reliable diagnostic tool. However, as the method claims, tenderness at five of the eight Adler-Langer points decreased after neural therapy.
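    The agreement statistic reported in this abstract can be made concrete with a short sketch of Cohen's kappa for two raters on the three-level scale. The ratings below are invented toy values, not study data, and the function name is chosen here for illustration.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(ratings_a)
    # Observed agreement: share of items on which both raters gave the same label.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected agreement if both raters labeled independently at their own base rates.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Toy ratings for the eight points of one hypothetical patient.
rater = ["strong", "weak", "none", "weak", "strong", "none", "weak", "none"]
patient = ["strong", "none", "none", "weak", "weak", "none", "strong", "weak"]

kappa = cohens_kappa(rater, patient)
print(f"Cohen's kappa = {kappa:.3f}")
```

    A kappa of 1 means perfect agreement and 0 means agreement no better than chance, so the reported ranges sit close to the chance end of the scale.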






  • Article type: Journal Article
    OBJECTIVE: Blood stasis is the slowing or stagnation of blood and can cause metabolic, musculoskeletal, and gynecological diseases. This study developed the Blood Stasis Questionnaire for gynecological disease (BSQ-GD) by extracting clinical indicators related to gynecological diseases using the Blood Stasis Questionnaires I and II (BSQ-I and II, respectively) and analyzed the clinical data of a cross-sectional study.
    METHODS: In total, 103 women aged between 25 and 65 years who met gynecological disease criteria were enrolled in this study. Blood stasis scores (BSS) were evaluated using the BSQ-II, and participants were categorized into BSS and non-BSS groups. To assess the reliability of the BSQ-GD, internal consistency was measured using Cronbach's α. Furthermore, correlation analyses were conducted for the clinical symptoms related to gynecological diseases, and the discriminant validity was confirmed by comparing the two groups. The prediction accuracy was determined using logistic regression, and the cut-off value of the BSQ-GD was established via the sensitivity and specificity calculations.
    RESULTS: The BSQ-GD showed satisfactory internal consistency (Cronbach's α coefficient = 0.71) and validity, with significant differences in mean scores between blood stasis (22.30 ± 3.34) and non-blood stasis (14.93 ± 3.49) groups. The cut-off value of the BSQ-GD score was 19 points, where the Youden index (73.45) and the concordance probability (0.75) were at their maximum. The area under the receiver operating characteristic curve was approximately 96%, and the sensitivity and specificity of the diagnostic accuracy according to the cut-off value were 80.95% and 92.50%, respectively.
    CONCLUSIONS: The BSQ-GD can be an appropriate instrument to estimate blood stasis in patients with gynecological diseases; its diagnostic sensitivity according to the cut-off value is high.
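    The reported Youden index follows directly from the reported sensitivity and specificity (80.95 + 92.50 - 100 = 73.45 on the percentage scale). The sketch below shows that arithmetic and one way a cut-off could be chosen by maximizing the index. The scores and labels are invented toy data, and `best_cutoff` is a hypothetical helper rather than the authors' procedure.

```python
def youden_index(sensitivity, specificity):
    """Youden's J on the 0-1 scale: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0

def best_cutoff(scores, labels):
    """Try each observed score as a threshold (score >= threshold is positive)
    and return the (threshold, J) pair with the largest Youden index."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
        j = youden_index(tp / positives, tn / negatives)
        if best is None or j > best[1]:
            best = (threshold, j)
    return best

# Values reported in the abstract, converted to proportions.
reported_j = youden_index(0.8095, 0.9250)

# Toy questionnaire scores with blood stasis labels (1 = blood stasis group).
scores = [12, 14, 15, 17, 18, 19, 20, 22, 23, 25]
labels = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
cutoff, j = best_cutoff(scores, labels)
print(f"reported J = {reported_j:.4f}; toy cut-off = {cutoff}, J = {j:.3f}")
```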






  • Article type: Journal Article
    BACKGROUND: With the capability to render prediagnoses, consumer wearables have the potential to affect subsequent diagnoses and the level of care in the health care delivery setting. Despite this, postmarket surveillance of consumer wearables has been hindered by the lack of codified terms in electronic health records (EHRs) to capture wearable use.
    OBJECTIVE: We sought to develop a weak supervision-based approach to demonstrate the feasibility and efficacy of EHR-based postmarket surveillance on consumer wearables that render atrial fibrillation (AF) prediagnoses.
    METHODS: We applied data programming, where labeling heuristics are expressed as code-based labeling functions, to detect incidents of AF prediagnoses. A labeler model was then derived from the predictions of the labeling functions using the Snorkel framework. The labeler model was applied to clinical notes to probabilistically label them, and the labeled notes were then used as a training set to fine-tune a classifier called Clinical-Longformer. The resulting classifier identified patients with an AF prediagnosis. A retrospective cohort study was conducted, where the baseline characteristics and subsequent care patterns of patients identified by the classifier were compared against those who did not receive a prediagnosis.
    RESULTS: The labeler model derived from the labeling functions showed high accuracy (0.92; F1-score=0.77) on the training set. The classifier trained on the probabilistically labeled notes accurately identified patients with an AF prediagnosis (0.95; F1-score=0.83). The cohort study conducted using the constructed system carried enough statistical power to verify the key findings of the Apple Heart Study, which enrolled a much larger number of participants, where patients who received a prediagnosis tended to be older, male, and White with higher CHA2DS2-VASc (congestive heart failure, hypertension, age ≥75 years, diabetes, stroke, vascular disease, age 65-74 years, sex category) scores (P<.001). We also made a novel discovery that patients with a prediagnosis were more likely to use anticoagulants (525/1037, 50.63% vs 5936/16,560, 35.85%) and have an eventual AF diagnosis (305/1037, 29.41% vs 262/16,560, 1.58%). At the index diagnosis, the existence of a prediagnosis did not distinguish patients based on clinical characteristics, but did correlate with anticoagulant prescription (P=.004 for apixaban and P=.01 for rivaroxaban).
    CONCLUSIONS: Our work establishes the feasibility and efficacy of an EHR-based surveillance system for consumer wearables that render AF prediagnoses. Further work is necessary to generalize these findings for patient populations at other sites.
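    The data programming idea in the methods, where heuristics are written as labeling functions whose votes are combined into probabilistic labels, can be sketched in plain Python. This is an illustration only: the keyword patterns and example notes are hypothetical, and a simple vote ratio stands in for the label model that the study derived with the Snorkel framework.

```python
import re

ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1

def lf_wearable_alert(note):
    """Vote POSITIVE when a wearable and an AF-type finding are both mentioned."""
    text = note.lower()
    device = re.search(r"apple watch|smartwatch|wearable", text)
    rhythm = re.search(r"atrial fibrillation|afib|irregular rhythm", text)
    return POSITIVE if device and rhythm else ABSTAIN

def lf_no_device(note):
    """Vote NEGATIVE when the note mentions no consumer device at all."""
    return ABSTAIN if re.search(r"watch|wearable|fitbit", note.lower()) else NEGATIVE

def lf_negated(note):
    """Vote NEGATIVE when wearable alerts are explicitly denied."""
    pattern = r"\b(denies|no) (any )?(watch|wearable) (alerts?|notifications?)"
    return NEGATIVE if re.search(pattern, note.lower()) else ABSTAIN

LABELING_FUNCTIONS = [lf_wearable_alert, lf_no_device, lf_negated]

def probabilistic_label(note):
    """Share of non-abstaining votes that are POSITIVE (None if all abstain).
    A plain vote ratio is used here in place of a learned label model."""
    votes = [lf(note) for lf in LABELING_FUNCTIONS]
    cast = [v for v in votes if v != ABSTAIN]
    return sum(cast) / len(cast) if cast else None

notes = [
    "Patient reports Apple Watch notified him of an irregular rhythm last week.",
    "Follow-up for hypertension; medications reviewed.",
    "Patient denies wearable alerts; wears a smartwatch, history of atrial fibrillation.",
]
soft_labels = [probabilistic_label(n) for n in notes]
print(soft_labels)
```

    The third note shows why soft labels matter: two heuristics disagree, so the note receives an intermediate label instead of a hard positive or negative.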






  • Article type: Journal Article
    BACKGROUND: Real-time surveillance of emerging infectious diseases necessitates a dynamically evolving, computable case definition, which frequently incorporates symptom-related criteria. For symptom detection, both population health monitoring platforms and research initiatives primarily depend on structured data extracted from electronic health records.
    OBJECTIVE: This study sought to validate and test an artificial intelligence (AI)-based natural language processing (NLP) pipeline for detecting COVID-19 symptoms from physician notes in pediatric patients. We specifically studied patients presenting to the emergency department (ED), who can be sentinel cases in an outbreak.
    METHODS: Subjects in this retrospective cohort study were patients aged 21 years and younger who presented to a pediatric ED at a large academic children's hospital between March 1, 2020, and May 31, 2022. The ED notes for all patients were processed with an NLP pipeline tuned to detect the mention of 11 COVID-19 symptoms based on Centers for Disease Control and Prevention (CDC) criteria. For a gold standard, 3 subject matter experts labeled 226 ED notes and had strong agreement (F1-score=0.986; positive predictive value [PPV]=0.972; and sensitivity=1.0). F1-score, PPV, and sensitivity were used to compare the performance of both NLP and the International Classification of Diseases, 10th Revision (ICD-10) coding to the gold standard chart review. As a formative use case, variations in symptom patterns were measured across SARS-CoV-2 variant eras.
    RESULTS: There were 85,678 ED encounters during the study period, including 3420 (4%) involving patients with COVID-19. NLP was more accurate at identifying encounters with patients who had any of the COVID-19 symptoms (F1-score=0.796) than ICD-10 codes (F1-score=0.451). NLP accuracy was higher for positive symptoms (sensitivity=0.930) than ICD-10 (sensitivity=0.300). However, ICD-10 accuracy was higher for negative symptoms (specificity=0.994) than NLP (specificity=0.917). Congestion or runny nose showed the highest accuracy difference (NLP: F1-score=0.828 and ICD-10: F1-score=0.042). For encounters with patients with COVID-19, prevalence estimates of each NLP symptom differed across variant eras. Patients with COVID-19 were more likely to have each NLP symptom detected than patients without this disease. Effect sizes (odds ratios) varied across pandemic eras.
    CONCLUSIONS: This study establishes the value of AI-based NLP as a highly effective tool for real-time COVID-19 symptom detection in pediatric patients, outperforming traditional ICD-10 methods. It also reveals the evolving nature of symptom prevalence across different virus variants, underscoring the need for dynamic, technology-driven approaches in infectious disease surveillance.
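    The evaluation metrics in this abstract are tied together by simple formulas: the F1-score is the harmonic mean of PPV and sensitivity. The sketch below checks the reported annotator agreement against that formula and computes the same metrics from confusion-matrix counts. The counts are invented for illustration, and the function names are assumptions.

```python
def f1_from_ppv_sensitivity(ppv, sensitivity):
    """F1-score as the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)

def metrics_from_counts(tp, fp, fn, tn):
    """PPV, sensitivity, specificity, and F1 from confusion-matrix counts."""
    ppv = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "ppv": ppv,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1_from_ppv_sensitivity(ppv, sensitivity),
    }

# Reported annotator agreement: PPV=0.972 and sensitivity=1.0.
annotator_f1 = f1_from_ppv_sensitivity(0.972, 1.0)
print(f"F1 from reported PPV and sensitivity: {annotator_f1:.3f}")

# Hypothetical counts for one symptom detector compared with chart review.
toy = metrics_from_counts(tp=90, fp=10, fn=10, tn=890)
print(toy)
```

    The first calculation reproduces the reported F1-score of 0.986 to three decimal places.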






  • Article type: Journal Article
    BACKGROUND: Medical students in Japan undergo a 2-year postgraduate residency program to acquire clinical knowledge and general medical skills. The General Medicine In-Training Examination (GM-ITE) assesses postgraduate residents' clinical knowledge. A clinical simulation video (CSV) may assess learners' interpersonal abilities.
    OBJECTIVE: This study aimed to evaluate the relationship between GM-ITE scores and resident physicians' diagnostic skills by having them watch a CSV and to explore resident physicians' perceptions of the CSV's realism, educational value, and impact on their motivation to learn.
    METHODS: The participants included 56 postgraduate medical residents who took the GM-ITE between January 21 and January 28, 2021; watched the CSV; and then provided a diagnosis. The CSV and GM-ITE scores were compared, and the validity of the simulations was examined using discrimination indices, wherein ≥0.20 indicated high discriminatory power and >0.40 indicated a very good measure of the subject's qualifications. Additionally, we administered an anonymous questionnaire to ascertain participants' views on the realism and educational value of the CSV and its impact on their motivation to learn.
    RESULTS: Of the 56 participants, 6 (11%) provided the correct diagnosis, and all were from the second postgraduate year. All domains indicated high discriminatory power. In the anonymous follow-up survey, 12 (52%) participants found the CSV format more suitable than the conventional GM-ITE for assessing clinical competence, 18 (78%) affirmed the realism of the video simulation, and 17 (74%) indicated that the experience increased their motivation to learn.
    CONCLUSIONS: The findings indicated that CSV modules simulating real-world clinical examinations were successful in assessing examinees' clinical competence across multiple domains. The study demonstrated that the CSV not only augmented the assessment of diagnostic skills but also positively impacted learners' motivation, suggesting a multifaceted role for simulation in medical education.
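    The abstract does not state how its discrimination indices were computed. One common formulation, assumed here purely for illustration, is the upper-lower group method: rank examinees by total score, take the top and bottom groups (conventionally about 27% each), and subtract the lower group's proportion correct from the upper group's. The scores and item responses below are toy values.

```python
def discrimination_index(total_scores, item_correct, group_fraction=0.27):
    """Upper-lower discrimination index: proportion correct in the top-scoring
    group minus proportion correct in the bottom-scoring group."""
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i], reverse=True)
    k = max(1, round(len(order) * group_fraction))
    upper, lower = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower

# Toy data: overall exam scores and whether each examinee got one item right.
total_scores = [90, 85, 80, 75, 70, 65, 60, 55, 50, 45]
item_correct = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]

d = discrimination_index(total_scores, item_correct)
print(f"discrimination index = {d:.2f}")
```

    Under the thresholds quoted in the methods, this toy item (index above 0.40) would count as a very good measure.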






  • Article type: Randomized Controlled Trial
    Patients with gastric atrophy and intestinal metaplasia (IM) are at risk for gastric cancer, necessitating an accurate risk assessment. We aimed to establish and validate a diagnostic approach for gastric biopsy specimens using deep learning and OLGA/OLGIM for individual gastric cancer risk classification.
    In this study, we prospectively enrolled 545 patients suspected of atrophic gastritis during endoscopy from 13 tertiary hospitals between December 22, 2017, and September 25, 2020, with a total of 2725 whole-slide images (WSIs). Patients were randomly divided into a training set (n = 349), an internal validation set (n = 87), and an external validation set (n = 109). Sixty patients from the external validation set were randomly selected and divided into two groups for an observer study, one with the assistance of algorithm results and the other without. We proposed a semi-supervised deep learning algorithm to diagnose and grade IM and atrophy, and we compared it with the assessments of 10 pathologists. The model's performance was evaluated based on the area under the curve (AUC), sensitivity, specificity, and weighted kappa value.
    The algorithm, named GasMIL, was established and demonstrated encouraging performance in diagnosing IM (AUC 0.884, 95% CI 0.862-0.902) and atrophy (AUC 0.877, 95% CI 0.855-0.897) in the external test set. In the observer study, GasMIL achieved an 80% sensitivity, 85% specificity, a weighted kappa value of 0.61, and an AUC of 0.953, surpassing the performance of all 10 pathologists in diagnosing atrophy. Among the 10 pathologists, GasMIL's AUC ranked second in OLGA (0.729, 95% CI 0.625-0.833) and fifth in OLGIM (0.792, 95% CI 0.688-0.896). With the assistance of GasMIL, pathologists demonstrated improved AUC (p = 0.013), sensitivity (p = 0.014), and weighted kappa (p = 0.016) in diagnosing IM, and improved specificity (p = 0.007) in diagnosing atrophy compared to pathologists working alone.
    GasMIL shows the best overall performance in diagnosing IM and atrophy when compared to pathologists, significantly enhancing their diagnostic capabilities.
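    The AUC values reported here have a direct probabilistic reading: the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by pairwise comparison on invented toy scores. It illustrates the metric only and is unrelated to the GasMIL implementation.

```python
def auc(scores, labels):
    """AUC as the share of positive-negative pairs ranked correctly,
    counting ties as half a correct ranking."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Toy model scores for six slides (1 = lesion present on reference diagnosis).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

toy_auc = auc(scores, labels)
print(f"AUC = {toy_auc:.3f}")
```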






  • Article type: Journal Article
    Diagnosis is a core component of effective health care, but misdiagnosis is common and can put patients at risk. Diagnostic decision support systems can play a role in improving diagnosis by physicians and other health care workers. Symptom checkers (SCs) have been designed to improve diagnosis and triage (ie, which level of care to seek) by patients.
    The aim of this study was to evaluate the performance of the new large language model ChatGPT (versions 3.5 and 4.0), the widely used WebMD SC, and an SC developed by Ada Health in the diagnosis and triage of patients with urgent or emergent clinical problems compared with the final emergency department (ED) diagnoses and physician reviews.
    We used previously collected, deidentified, self-report data from 40 patients presenting to an ED for care who used the Ada SC to record their symptoms prior to seeing the ED physician. Deidentified data were entered into ChatGPT versions 3.5 and 4.0 and WebMD by a research assistant blinded to diagnoses and triage. Diagnoses from all 4 systems were compared with the previously abstracted final diagnoses in the ED as well as with diagnoses and triage recommendations from three independent board-certified ED physicians who had blindly reviewed the self-report clinical data from Ada. Diagnostic accuracy was calculated as the proportion of the diagnoses from ChatGPT, Ada SC, WebMD SC, and the independent physicians that matched at least one ED diagnosis (stratified as top 1 or top 3). Triage accuracy was calculated as the number of recommendations from ChatGPT, WebMD, or Ada that agreed with at least 2 of the independent physicians or were rated "unsafe" or "too cautious."
    Overall, 30 and 37 cases had sufficient data for diagnostic and triage analysis, respectively. The rate of top-1 diagnosis matches for Ada, ChatGPT 3.5, ChatGPT 4.0, and WebMD was 9 (30%), 12 (40%), 10 (33%), and 12 (40%), respectively, with a mean rate of 47% for the physicians. The rate of top-3 diagnostic matches for Ada, ChatGPT 3.5, ChatGPT 4.0, and WebMD was 19 (63%), 19 (63%), 15 (50%), and 17 (57%), respectively, with a mean rate of 69% for physicians. The distribution of triage results for Ada was 62% (n=23) agree, 14% (n=5) unsafe, and 24% (n=9) too cautious; that for ChatGPT 3.5 was 59% (n=22) agree, 41% (n=15) unsafe, and 0% (n=0) too cautious; that for ChatGPT 4.0 was 76% (n=28) agree, 22% (n=8) unsafe, and 3% (n=1) too cautious; and that for WebMD was 70% (n=26) agree, 19% (n=7) unsafe, and 11% (n=4) too cautious. The unsafe triage rate for ChatGPT 3.5 (41%) was significantly higher (P=.009) than that of Ada (14%).
    ChatGPT 3.5 had high diagnostic accuracy but a high unsafe triage rate. ChatGPT 4.0 had the poorest diagnostic accuracy, but a lower unsafe triage rate and the highest triage agreement with the physicians. The Ada and WebMD SCs performed better overall than ChatGPT. Unsupervised patient use of ChatGPT for diagnosis and triage is not recommended without improvements to triage accuracy and extensive clinical evaluation.
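    The top-1 and top-3 accuracy measure used in this study can be sketched as follows. The cases are invented, and exact string matching is a simplification assumed here; the abstract does not describe how a match between a suggested diagnosis and an ED diagnosis was judged.

```python
def top_k_match(ranked_diagnoses, reference_diagnoses, k):
    """True if any of the first k suggested diagnoses appears in the reference set."""
    return any(d in reference_diagnoses for d in ranked_diagnoses[:k])

def match_rate(cases, k):
    """Proportion of cases with at least one match among the top k suggestions."""
    return sum(top_k_match(ranked, reference, k) for ranked, reference in cases) / len(cases)

# Toy cases: (ranked suggestions from a tool, set of final ED diagnoses).
cases = [
    (["appendicitis", "gastroenteritis", "renal colic"], {"appendicitis"}),
    (["migraine", "tension headache", "sinusitis"], {"sinusitis"}),
    (["costochondritis", "gerd", "anxiety"], {"pulmonary embolism"}),
]

top1 = match_rate(cases, k=1)
top3 = match_rate(cases, k=3)
print(f"top-1 = {top1:.0%}, top-3 = {top3:.0%}")
```

    By construction the top-3 rate can never fall below the top-1 rate, which is consistent with the pattern in the reported results.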






  • Article type: Journal Article
    BACKGROUND: Challenges remain for general practitioners (GPs) in diagnosing (pre)malignant and benign skin lesions. Teledermoscopy (TDsc) supports GPs in diagnosing these skin lesions guided by teledermatologists' (TDs) diagnosis and advice and prevents unnecessary referrals to dermatology care. However, the impact of the availability of TDsc on GPs' self-reported referral decisions to dermatology care before and after the TDsc consultation is unknown.
    OBJECTIVE: The objective of this study is to assess and compare the initial self-reported referral decisions of GPs before TDsc versus their final self-reported referral decisions after TDsc for skin lesions diagnosed by the TD as (pre)malignant or benign.
    METHODS: TDsc consultations requested by GPs in daily practice between July 2015 and June 2020 with a TD assessment and diagnosis were extracted from a nationwide Dutch telemedicine database. Based on GP self-administered questions, the GPs' referral decisions before and their final referral decision after TDsc consultation were assessed for (pre)malignant and benign TD diagnoses.
    RESULTS: GP self-administered questions and TD diagnoses were evaluated for 6364 TDsc consultations (9.3% malignant, 8.8% premalignant, and 81.9% benign skin lesions). In half of the TDsc consultations, GPs adjusted their initial referral decision after TD advice and TD diagnosis. Initially, GPs did not have the intention to refer 67 (56.8%) of 118 patients with a malignant TD diagnosis and 26 (16.0%) of 162 patients with a premalignant TD diagnosis but then decided to refer these patients after the TDsc consultation. Furthermore, GPs adjusted their decision from referral to nonreferral for 2534 (74.9%) benign skin lesions (including 676 seborrheic keratosis and 131 vascular lesions).
    CONCLUSIONS: GPs adjusted their referral decision in 52% (n=3306) of the TDsc consultations after the TD assessment. The availability of TDsc is thus of added value and assists GPs in their (non)referral for patients with skin lesions to dermatology care. TDsc resulted in referrals of patients with (pre)malignant skin lesions whom GPs would not otherwise have referred directly to the dermatologist. TDsc also led to a reduction of unnecessary referrals of patients with low-complexity benign skin lesions (eg, seborrheic keratosis and vascular lesions).
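    The headline figures in this abstract are proportions of before and after decision pairs. The sketch below tallies such pairs on a few invented examples and then re-derives the reported percentages from the counts quoted in the abstract. The decision labels and helper name are assumptions for illustration.

```python
from collections import Counter

def adjusted_share(decisions):
    """Fraction of consultations where the referral decision changed after TDsc."""
    changed = sum(1 for before, after in decisions if before != after)
    return changed / len(decisions)

# Toy (before, after) decision pairs; the labels are illustrative only.
decisions = [
    ("no referral", "referral"),
    ("referral", "no referral"),
    ("referral", "referral"),
    ("no referral", "no referral"),
]
transitions = Counter(decisions)
toy_share = adjusted_share(decisions)

# Re-deriving percentages from the counts quoted in the abstract.
overall_pct = round(100 * 3306 / 6364)       # adjusted decisions, all consultations
malignant_pct = round(100 * 67 / 118, 1)     # malignant lesions newly referred
premalignant_pct = round(100 * 26 / 162, 1)  # premalignant lesions newly referred
print(overall_pct, malignant_pct, premalignant_pct)
```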






  • Article type: Journal Article
    Behçet's Disease (BD) is a chronic multisystem vasculitis that manifests with destructive inflammation affecting the eyes, central nervous system, and blood vessels. The pathology of vein involvement in BD is poorly characterized. Magnetic resonance (MR) venography gives more comprehensive information about deep veins and adjacent tissues. In this study, we aimed to characterize vein involvement and evaluate the diagnostic utility of MR venography in BD.
    Sixty-five BD patients who fulfilled the International Study Group (ISG) criteria and 20 healthy control subjects were enrolled. Inferior vena cava (IVC), common iliac veins (CIV), external (EIV) and internal iliac veins (IIV), common femoral veins (CFV), femoral veins (FV), and greater saphenous veins (GSV) of BD patients and healthy controls were evaluated with MR venography and ultrasonography for the presence of pathologic features, luminal thrombi, vessel wall changes, and perivascular abnormalities.
    Thirty-three vascular and 32 non-vascular BD patients (mean age 39.3 ± 11.3 years and 48 [73.8%] male) were enrolled. MR venography revealed diffuse concentric thickening of the walls of IVC, CIV, EIV, IIV, CFV, FV, and GSV in BD (healthy controls vs. BD p<0.05 for all vein segments). MR venography provided additional information about veins and perivascular tissues like contrast enhancement, enlarged lymph nodes, and seminal vesicle vascularization, which were remarkably more frequent in vascular BD than non-vascular BD and healthy controls.
    The results of our study suggest that the involvement of the venous system is diffuse and generalized in BD, and demonstration of venulitis might help diagnose the disease.





