Suicide is one of the ten leading causes of death in the United States, yet suicide prevention remains difficult. Timely identification, intervention, and monitoring are necessary to extend effective psychiatric care. Clinical interview, patient self-report, family observation, and assessment scales remain the primary sources for assessing risk and monitoring safety. However, existing strategies are limited by their reliance on direct and timely contact with trained professionals, as well as on accurate and insightful patient and family recall. These standard approaches to suicide assessment do not allow for objective monitoring of suicidal thoughts and actions, and they typically do not occur with enough frequency, or at the necessary level of detail, to detect important symptomatic exacerbation. Early, precise, and objective identification and monitoring of suicidality could facilitate the initiation of personalized and proactive intervention strategies. The increasing uptake of online platforms, such as search engines and social media, provides a timely opportunity to gather indicators of individuals' psychological state, including suicidal thoughts and behaviors (STBs). At the same time, with the ongoing digitization of medical records in many healthcare systems, large clinical datasets for mental health are becoming available for interrogation. To this end, the current proposal aims to identify clinically meaningful signals obtained from the integration of online activity (Internet search and social media) and medical records, to improve psychiatric outcomes for patients with STBs. Specifically, the team will combine online activity data and medical record data from an existing dataset of patients with and without STBs who are receiving psychiatric care at Northwell Health and collaborating institutions.
These data were collected by the team via several National Institutes of Health-funded initiatives, including an R34, two R01s, and a cooperative research agreement, and span over 1.28 million Facebook, Twitter, and Instagram posts and Google search queries, together with 12.8 thousand clinical notes (N=368). The team will first identify risk and protective markers of STBs in patients' online activity using natural language processing (NLP) techniques and machine learning (ML) models. Additionally, key contextual information related to suicidal events will be extracted from the same patients' medical records via expert- and NLP-based coding of clinical notes, leveraging the Columbia Suicide Severity Rating Scale (C-SSRS). Then, using ML, the team will examine the extent to which these digital phenotypes can predict current or forthcoming STBs, as indicated by suicide-related hospitalizations and by C-SSRS-based coding of suicidal events in medical records. This project expands past and ongoing collaborations of a strong, interdisciplinary team with expertise in computer science and early interventions in psychiatry. It will help address key scientific gaps in suicide research by integrating existing rich repositories of online data and medical records, and by using ML and NLP techniques to quantify STBs and identify multidimensional, previously unexplored predictors.