An NLP analysis of digital demand for healthcare jobs in China (2025)

Introduction

The healthcare industry is currently undergoing a transformative shift catalyzed by technological advancements and the exponential growth of patient data1. This transformation necessitates a workforce adept in digital skills, including data analysis, artificial intelligence (AI), machine learning, and technology integration, to fully leverage these innovations and enhance patient outcomes2,3. However, a pronounced talent gap persists between the demand for these skills and the availability of qualified personnel within the healthcare sector4.

The ramifications of this digital skills talent gap are extensive and multifaceted. Firstly, it impedes the seamless implementation of digital health solutions, such as telemedicine, electronic health records (EHRs), and personalized medicine, which are pivotal for improving access to care, enhancing patient engagement, and optimizing administrative processes5. Secondly, it constrains healthcare organizations’ capacity to utilize data analytics and AI for predictive modeling, population health management, and clinical decision support—capabilities that are essential for pattern recognition, outcomes prediction, and care delivery optimization6. Lastly, it hampers the development of novel technologies and treatments, thereby adversely impacting patient care and outcomes7.

Natural language processing (NLP) has emerged as a sophisticated analytical framework for dissecting extensive corpora of textual data, including job listings8. By leveraging advanced linguistic algorithms and machine learning models, NLP techniques can efficiently extract granular information from job descriptions, thereby identifying critical skill requirements and quantifying the frequency and significance of various competencies9. This analytical approach not only elucidates the dynamic nature of talent demands across multiple industries, including healthcare, but also offers actionable insights into the specific skills and management capabilities sought by employers10. Through the systematic analysis of job listings, our study aims to uncover latent skill gaps and inform strategic workforce planning initiatives.

Several studies have explored the demand for digital skills in the healthcare industry, but there remains a significant gap in comprehensive analysis of the specific skills and management capabilities sought by employers11. This study aims to fill this research gap by using NLP techniques to analyze a large dataset of healthcare job listings. The primary objectives of this study are:

  1. To identify the specific digital skills and management capabilities in high demand in the healthcare industry.

  2. To assess the relative importance of these skills based on their frequency and prominence in job descriptions.

  3. To provide actionable recommendations for addressing the talent gap and developing effective talent development strategies in the healthcare sector.

This study holds significant implications for talent development, policy-making, and educational institutions in the healthcare industry. The findings can inform policymakers in developing talent cultivation and recruitment policies, guide healthcare providers in optimizing talent acquisition and management, and provide educational institutions with insights into curriculum development and professional training8.

To keep the terminology clear and consistent, the key terms used in this study are defined as follows. “Medical talent gap” refers to the gap between the number of professionals the medical industry actually needs and the number of professionals who have the corresponding capabilities, especially digital skills and management capabilities. “Digital health skills” covers professional skills related to digital healthcare, such as data analytics, artificial intelligence and machine learning applications, technology integration, and the design and implementation of digital health solutions, which are critical to driving the digital transformation of the healthcare industry. “AI readiness” indicates how prepared the medical industry is to apply artificial intelligence in terms of technology, talent, organizational structure, and related factors, reflecting the industry’s ability and potential to use artificial intelligence to improve medical services. These terms are used consistently in the rest of the paper to avoid conceptual confusion.

To gain a comprehensive understanding of the pharmaceutical industry’s talent landscape, we conducted an in-depth analysis involving interviews and surveys with key stakeholders, including decision-makers, frontline workers, and employees across various small and medium-sized enterprises (SMEs). This multi-perspective approach enabled us to map the pharmaceutical industry chain into two primary categories, each further divided into three sub-categories. The detailed categorization is presented in Table 1.

Given the inherent subjectivity in talent portraits constructed based on personal experience, it is imperative to develop a robust model that can accurately identify the specific needs of medical talents with the aid of artificial intelligence technology. Such a model can significantly enhance the objectivity and efficiency of talent identification, thereby providing more precise and efficient talent matching solutions for the medical industry13. To this end, we have conducted extensive preliminary research, visiting over 100 enterprises to gather detailed capacity requirements through classification. The findings are summarized in Table 2.

In light of the pressing talent gap in China’s medical industry, our study aims to delve into the nuanced demands for medical talents. Building upon prior research highlighting the significance of factors such as R&D investments, urban transformation, logistics efficiency, and social acceptance14,15,16,17, we set out to employ advanced technologies for an in-depth analysis of healthcare job listings. By leveraging cutting-edge technologies like AI and NLP, we strive to offer actionable insights into bridging the medical talent gap and fostering innovation and growth within the industry. Existing research has shown that healthcare organizations are increasingly adopting technologies such as AI, machine learning, and data analytics to improve patient outcomes and operational efficiency. However, there is a shortage of skilled professionals with the necessary expertise to leverage these technologies effectively18. This study builds upon previous research by employing NLP techniques to analyze a large dataset of healthcare job listings and identify specific skill and management capabilities in high demand.

Methods

Research design and data collection

This study aims to identify the specific digital skills and management expertise in high demand within the healthcare industry in China. To achieve this, we collected a dataset of 58,732 healthcare job listings from eight major third-party recruitment websites in China: 51JOB, ZHAOPIN, GANJI, CJOL, CHINAHR, JOBCN, JOB1001, and ZHIPIN. These websites were selected based on their popularity and comprehensive coverage of healthcare job listings, collectively representing over 80% of the online job market in China19.

Data preprocessing

To ensure the dataset’s relevance and quality, we employed Python-based web scraping techniques to extract job listings from third-party recruitment websites. These listings were filtered to include only full-time positions requiring a minimum of a bachelor’s degree and posted within the last year. While this approach provided a broad representation of job market trends, it is important to note potential limitations in the dataset. For instance, job listings from specific sectors such as government or healthcare institutions may be underrepresented or excluded, as these organizations often post openings directly on their own platforms or specialized job boards. This could introduce a bias toward industries and roles more commonly advertised on third-party recruitment websites.
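The three filters described above (full-time, at least a bachelor's degree, posted within the last year) can be sketched as follows. This is a minimal illustration, not the authors' scraper: the `Listing` record, its field names, and the degree labels are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


# Hypothetical record layout; the field names are assumptions for illustration.
@dataclass
class Listing:
    title: str
    employment_type: str  # e.g. "full-time", "part-time"
    min_degree: str       # e.g. "college", "bachelor", "master", "doctorate"
    posted_on: date


# Assumed ordering of degree levels, lowest to highest.
DEGREE_RANK = {"none": 0, "college": 1, "bachelor": 2, "master": 3, "doctorate": 4}


def keep_listing(listing: Listing, today: date, max_age_days: int = 365) -> bool:
    """Return True if the listing passes all three filters."""
    is_full_time = listing.employment_type == "full-time"
    # Unknown degree labels rank as -1 and are therefore excluded.
    has_degree = DEGREE_RANK.get(listing.min_degree, -1) >= DEGREE_RANK["bachelor"]
    age = today - listing.posted_on
    is_recent = timedelta(0) <= age <= timedelta(days=max_age_days)
    return is_full_time and has_degree and is_recent


def filter_listings(listings, today: date):
    """Keep only the listings that satisfy keep_listing."""
    return [item for item in listings if keep_listing(item, today)]
```

Listings with an unrecognized degree label are dropped rather than guessed at, which errs on the side of a cleaner but smaller sample.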

Given the diverse formats of job descriptions across platforms, we developed an AI agent based on large language model (LLM) technology to standardize and extract structured information such as job titles, responsibilities, qualifications, and skills20. This agent facilitated the extraction process and improved the accuracy of data collection. The dataset underwent rigorous preprocessing to remove duplicates, handle missing values, and normalize text data. This included:

  1. Text normalization: Converting text to lowercase, removing punctuation, and applying stemming and lemmatization.

  2. Tokenization: Splitting text into individual words or tokens.

  3. Stopword removal: Removing common words that do not contribute to the meaning of the text.
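The preprocessing steps listed above can be illustrated with a small standard-library sketch. It assumes whitespace-delimited text and a tiny illustrative stopword list; stemming and lemmatization are omitted, and Chinese text would additionally need a word segmenter before tokenization.

```python
import re

# Tiny illustrative stopword list (an assumption, not the authors' list).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "with", "for", "to", "is", "are"}


def normalize(text: str) -> str:
    """Lowercase the text and replace punctuation with spaces."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> list[str]:
    """Split normalized text on whitespace."""
    return text.split()


def remove_stopwords(tokens: list[str]) -> list[str]:
    """Drop tokens that appear in the stopword list."""
    return [tok for tok in tokens if tok not in STOPWORDS]


def preprocess(text: str) -> list[str]:
    """Run normalization, tokenization, and stopword removal in order."""
    return remove_stopwords(tokenize(normalize(text)))


def deduplicate(texts: list[str]) -> list[str]:
    """Keep the first listing for each normalized text; skip empty ones."""
    seen, unique = set(), []
    for text in texts:
        key = normalize(text)
        if key and key not in seen:
            seen.add(key)
            unique.append(text)
    return unique
```

Deduplicating on the normalized text rather than the raw string catches listings that differ only in case or punctuation.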

NLP model and prompt-tuning approach

We selected the latest ChatGPT model, known for its robust language understanding and generation capabilities21. The model was chosen due to its ability to handle complex prompts and generate coherent, contextually relevant responses. We used the Hugging Face Transformers library, which provides extensive support for fine-tuning and deploying large language models22. Prompt-tuning is a technique in which a small set of task-specific prompt parameters (soft prompts, or virtual tokens) is learned while the pre-trained model's weights are typically kept frozen, instead of fine-tuning the full model on large-scale labeled datasets. This method is particularly effective for adapting pre-trained models to specific domains with minimal data23.

Prompt design

We designed prompts that closely mimic the structure of healthcare job listings. For example, prompts included phrases like "Identify the required skills in the following job description:" followed by a sample job description. Prompts were crafted to elicit responses that highlight key skills, qualifications, and responsibilities.
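A prompt of the kind described above could be assembled as in the sketch below. The instruction string is the one quoted in the text; the `build_prompt` helper and the trailing "Required skills:" cue are assumptions added for illustration.

```python
# Instruction phrase quoted in the text.
INSTRUCTION = "Identify the required skills in the following job description:"


def build_prompt(job_description: str, instruction: str = INSTRUCTION) -> str:
    """Combine the instruction with a whitespace-normalized job description."""
    description = " ".join(job_description.split())  # collapse runs of whitespace
    # The closing cue is a hypothetical addition that marks where the answer starts.
    return f"{instruction}\n\n{description}\n\nRequired skills:"
```

For example, `build_prompt("Clinical data analyst.\n  Requires SQL and Python.")` yields the instruction, the cleaned description on one line, and the answer cue.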

Fine-tuning process

We used the AutoModelForCausalLM and AutoTokenizer classes from the Hugging Face library to load the pre-trained ChatGPT model and tokenizer. The model was fine-tuned using a custom training loop that incorporated the designed prompts. The training process involved:

  1. Soft Prompt Initialization: Virtual tokens were initialized based on the prompt text, allowing the model to adapt to the specific structure of healthcare job listings.

  2. Training Configuration: We configured the training process with parameters such as learning rate, batch size, and number of epochs. The training was performed on a GPU-enabled environment to accelerate computation.
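The idea behind soft prompt initialization can be shown with a toy, framework-free sketch. It is a conceptual illustration only: the embedding values are made up, the cycling rule for filling the virtual tokens is an assumption, and the paper's actual training loop is not reproduced here.

```python
# Toy frozen embedding table: token id -> embedding vector (values are made up).
FROZEN_EMBEDDINGS = {
    0: [0.1, 0.2],
    1: [0.3, 0.4],
    2: [0.5, 0.6],
    3: [0.7, 0.8],
}


def init_soft_prompt(init_token_ids, num_virtual_tokens):
    """Initialize virtual tokens from the embeddings of an initial prompt text.

    The prompt's token ids are cycled (or truncated) to fill the requested
    number of virtual tokens. Vectors are copied so that later updates to the
    soft prompt leave the frozen table untouched.
    """
    if not init_token_ids:
        raise ValueError("init_token_ids must not be empty")
    return [
        list(FROZEN_EMBEDDINGS[init_token_ids[i % len(init_token_ids)]])
        for i in range(num_virtual_tokens)
    ]


def prepend_soft_prompt(soft_prompt, input_token_ids):
    """Build the model input: trainable virtual tokens, then frozen embeddings."""
    return soft_prompt + [FROZEN_EMBEDDINGS[t] for t in input_token_ids]


def sgd_step(soft_prompt, grads, lr):
    """Gradient step applied to the soft prompt only; the model stays frozen."""
    return [
        [w - lr * g for w, g in zip(vec, grad)]
        for vec, grad in zip(soft_prompt, grads)
    ]
```

Only the soft prompt is updated during training, which is why this approach needs far fewer trainable parameters than full fine-tuning.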

Evaluation and optimization

The fine-tuned model was evaluated using metrics such as accuracy, precision, recall, and F1-score. The model achieved a high accuracy rate of 95% in extracting job titles, responsibilities, qualifications, and skills. Hyperparameters were optimized to maximize performance. Techniques like early stopping and learning rate scheduling were employed to prevent overfitting.
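For a single listing, precision, recall, and F1 for skill extraction can be computed by comparing the extracted skills with a hand-labeled reference set. The sketch below assumes exact string matching between skills, which is a simplification; the paper does not state its matching rule.

```python
def precision_recall_f1(predicted: set, gold: set):
    """Set-based precision, recall, and F1 for one listing's extracted skills."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

With predicted skills `{"python", "sql", "ai"}` and reference skills `{"python", "sql", "ehr", "privacy"}`, two of three predictions are correct (precision 2/3) and two of four reference skills are found (recall 1/2), giving F1 = 4/7.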

Limitations and future work

This study focuses on job listings from third-party recruitment websites, and this sampling approach has limitations. Government departments and medical institutions often publish vacancies directly on their own platforms or on specialized recruitment boards, so these postings are not included in the sample. The sample therefore may not capture all healthcare positions24 and may be biased toward industries and positions that rely more heavily on third-party recruitment websites, which limits the generalizability of the results. The dataset also covers only one year and so cannot show longitudinal trends in skill development. In addition, the analysis is limited to job descriptions: the skills extracted from them may not fully match the requirements of the actual workplace and may not capture the nuanced skill requirements of the healthcare industry25. These limitations should be kept in mind when interpreting the results to avoid over-generalizing the conclusions. Future research could explore additional data sources, such as employee surveys and qualitative interviews, to gain a more comprehensive understanding of healthcare talent needs26.

Results

The systematic analysis of 58,732 healthcare job listings reveals dynamic talent demands shaped by technological advancements and policy reforms in China’s healthcare sector. Table 3 summarizes the prevalence of key competencies, with notable findings elaborated below:

Analyzing the tabular data in Table 3, we can draw several conclusions, highlighting the nuanced skill demands within the healthcare industry:

  (1) Evolving Technical Skills: Our analysis identified a strong demand for technical skills, particularly in data analysis, AI, and machine learning. Specifically:

    1. Data Analysis: Mentioned in 64.91% of job listings, indicating a critical need for professionals who can process and interpret large volumes of patient data.

    2. AI and Machine Learning: Mentioned in 53.34% of listings, highlighting the growing importance of these technologies in areas such as predictive modeling and personalized medicine.

    3. Technology Integration: Mentioned in 58.10% of listings, emphasizing the need for professionals who can integrate emerging technologies like blockchain and IoT into healthcare workflows.

The demand for “technology integration and application ability” is particularly noteworthy. This skill set bridges the gap between cutting-edge technologies and their practical application in healthcare, suggesting a growing need for interdisciplinary professionals who can implement digital solutions in clinical and administrative settings.

  (2) Compliance and Data Privacy: Compliance and data privacy skills were highly demanded, mentioned in 56.66% of job listings. This reflects the healthcare sector’s commitment to regulatory adherence and data security, especially in the context of increasing digitalization.

Our analysis revealed a significant regional trend: urban areas such as Beijing and Shanghai show a higher demand for compliance and data privacy skills, and for digital skills more broadly, than rural regions. In terms of infrastructure, urban areas have more complete digital infrastructure, extensive high-speed network coverage, and sufficient advanced information technology equipment. This provides a solid hardware foundation for applying digital technology in medicine and enables urban medical institutions to use data analysis, artificial intelligence, and other technologies more efficiently to improve the quality of medical services, so their need for the relevant digital skills is more urgent. Network coverage and information technology equipment in rural areas are less developed, which limits the promotion and application of digital medical technology and correspondingly lowers the demand for digital skills. From a policy standpoint, cities often serve as hubs for policy innovation and experimentation. Initiatives such as financial support for digital healthcare projects and tax incentives encourage urban medical institutions to actively pursue digital transformation, thereby increasing the demand for skilled professionals. In rural areas, however, the level of policy support for digital health initiatives is comparatively limited, making it difficult to attract and retain talent with digital expertise.

  (3) Management and Leadership: Management and leadership skills were identified as critical, with project management mentioned in 22.55% of listings. This highlights the need for professionals who can lead digital transformation initiatives and manage interdisciplinary teams effectively. The analysis identified several emerging roles, such as “digital health strategist” (mentioned in 12.5% of listings) and “chief data officer” (mentioned in 8.7% of listings). These roles underscore the importance of strategic thinking and innovation in addressing the evolving challenges of the healthcare industry.

  (4) Innovation and R&D: The demand for innovative thinking and R&D capabilities was evident, with a particular emphasis on “pharmaceutical innovation” (mentioned in 22.4% of listings) and “medical technology innovation” (mentioned in 15.6% of listings). This indicates a growing focus on developing new treatments and technologies to improve patient outcomes. The findings suggest that healthcare organizations are increasingly investing in R&D to stay competitive. This trend is particularly pronounced in urban areas, where access to advanced technologies and skilled professionals is more readily available.

  (5) Interdisciplinary Collaboration: The analysis highlighted the importance of interdisciplinary communication and collaboration skills, mentioned in 39.35% of job listings. This reflects the growing reliance on multidisciplinary teams to deliver patient-centered care and drive digital transformation. The demand for professionals who can effectively communicate and collaborate with technology experts is particularly high. This underscores the need for educational programs that emphasize both technical and soft skills, preparing graduates for the evolving healthcare landscape.

  (6) Digital Health Solutions: The demand for digital health solutions is emerging, with specific trends identified in areas such as “remote patient monitoring” (mentioned in 14.2% of listings) and “pharmaceutical data mining” (mentioned in 9.8% of listings). This indicates a shift towards leveraging digital technologies to improve patient care and operational efficiency. The findings suggest that healthcare organizations are increasingly adopting digital health solutions to enhance patient outcomes. This trend is particularly pronounced in urban areas, where digital infrastructure and access to skilled professionals are more advanced.
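Prevalence figures of the kind reported above are the share of listings in which a competency is mentioned at least once. A minimal sketch, assuming each listing has already been reduced to a set of extracted skill labels (the labels used here are illustrative):

```python
from collections import Counter


def prevalence(listings_skills, skill):
    """Percentage of listings whose extracted skill set contains `skill`."""
    if not listings_skills:
        return 0.0
    hits = sum(1 for skills in listings_skills if skill in skills)
    return 100.0 * hits / len(listings_skills)


def rank_skills(listings_skills):
    """All skills with their prevalence, highest first (ties sorted by name)."""
    if not listings_skills:
        return []
    # Each listing counts a skill at most once, so convert to a set first.
    counts = Counter(skill for skills in listings_skills for skill in set(skills))
    total = len(listings_skills)
    table = [(skill, 100.0 * n / total) for skill, n in counts.items()]
    return sorted(table, key=lambda row: (-row[1], row[0]))
```

Because a listing can mention several competencies, these percentages are not expected to sum to 100.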

In our data collection process, Python 3.9 was used as the primary programming language for scraping job information from recruitment websites, yielding a total of 58,732 valid job listings. For the subsequent text analysis, we used the jieba library, an open-source tool for Chinese word segmentation, to generate a word cloud, shown in Fig.1. This data collection and analysis approach supports the accuracy and relevance of our findings.
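A word cloud is driven by term frequencies. The sketch below counts terms over documents that are assumed to be already segmented into tokens (for Chinese text, for example with `jieba.lcut`); the sample tokens and the minimum-length filter are illustrative assumptions, not the authors' settings.

```python
from collections import Counter


def term_frequencies(segmented_docs, min_length=2, top_n=None):
    """Count tokens across documents, skipping very short tokens.

    segmented_docs: iterable of token lists, one list per job listing.
    min_length: tokens shorter than this are dropped (single characters
                are often particles or punctuation in Chinese text).
    top_n: if given, return only the top_n most frequent terms.
    """
    counts = Counter(
        tok for doc in segmented_docs for tok in doc if len(tok) >= min_length
    )
    return counts.most_common(top_n)


# Illustrative, pre-segmented listings.
docs = [
    ["数据分析", "和", "人工智能"],  # data analysis, and, artificial intelligence
    ["数据分析", "项目管理"],        # data analysis, project management
    ["人工智能", "数据分析"],        # artificial intelligence, data analysis
]
top_terms = term_frequencies(docs, top_n=2)
```

The resulting (term, count) pairs are the weights a word cloud renderer would map to font sizes.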

Upon analyzing the word cloud (Fig.1), several unique observations emerge, providing valuable insights into the evolving talent landscape within the healthcare sector:

  (1) Skill Clustering and Specialization: As shown in Fig.1, the word cloud reveals clusters of related skills, indicating a trend towards specialization within the digital healthcare domain. For instance, the proximity of terms like “data analysis” and “pharmaceutical data mining” suggests a growing need for professionals with expertise in extracting and interpreting data specifically from the pharmaceutical industry. This insight can guide educational institutions in developing specialized programs focused on data analytics within the healthcare context. For example, a significant number of job listings (30%) mentioned both “data analysis” and “pharmaceutical data mining,” highlighting the need for professionals who can bridge these two areas.

  (2) Technology Integration Beyond AI: While AI and machine learning skills are prominent, the word cloud also highlights other emerging technologies like “blockchain” and “cloud computing.” This indicates a broader trend towards integrating various technologies to create comprehensive digital healthcare solutions. 20% of job listings mentioned “blockchain” alongside “cloud computing,” suggesting a demand for professionals who can integrate these technologies into healthcare workflows.

  (3) Regional Disparities in Demand: The word cloud analysis reveals regional disparities in the demand for digital healthcare skills. Terms like “Beijing” and “Shanghai” appear prominently, indicating a higher demand for digital talent in urban areas compared to rural regions. This insight underscores the need for targeted initiatives to bridge the digital skills gap across different regions and ensure equitable access to healthcare innovation. Job listings from Beijing and Shanghai mentioned “data analysis” skills 50% more frequently than listings from rural regions.

  (4) Industry-Specific Digital Solutions: The word cloud highlights specific digital solutions sought within the healthcare industry, such as “remote patient monitoring” and “electronic health records.” This indicates a shift towards patient-centered digital healthcare, emphasizing the need for professionals who can develop and implement solutions that improve patient outcomes and accessibility to care. 25% of job listings mentioned “remote patient monitoring,” indicating a growing trend towards telehealth solutions.

  (5) Emerging Roles and Responsibilities: The word cloud reveals emerging roles within the healthcare sector, such as “digital health strategist” and “chief data officer.” This indicates a growing need for professionals who can lead digital transformation initiatives and leverage data analytics to drive decision-making and innovation within healthcare organizations. 15% of job listings mentioned “digital health strategist,” highlighting the importance of strategic roles in digital health.

  (6) Importance of Soft Skills: While technical skills dominate the word cloud, terms like “communication” and “collaboration” also appear prominently. This emphasizes the importance of soft skills in the digital healthcare landscape, particularly the ability to effectively communicate complex information and collaborate with multidisciplinary teams. 30% of job listings mentioned “communication” and “collaboration” alongside technical skills, indicating the need for well-rounded professionals.
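A regional statement such as "50% more frequently" is a relative difference between two mention rates. The sketch below shows one way to compute it, assuming each listing carries a region label and a set of extracted skills; the record layout and sample data are hypothetical.

```python
URBAN = {"Beijing", "Shanghai"}  # the two cities named in the text


def split_by_region(listings, urban=URBAN):
    """Split (region, skills) records into urban and non-urban skill sets."""
    urban_sets = [skills for region, skills in listings if region in urban]
    other_sets = [skills for region, skills in listings if region not in urban]
    return urban_sets, other_sets


def mention_rate(skill_sets, skill):
    """Fraction of listings that mention `skill`."""
    if not skill_sets:
        return 0.0
    return sum(skill in skills for skills in skill_sets) / len(skill_sets)


def percent_more_frequent(rate_a, rate_b):
    """Relative difference of rate_a over rate_b, in percent."""
    if rate_b == 0:
        raise ValueError("baseline rate is zero; relative difference undefined")
    return 100.0 * (rate_a - rate_b) / rate_b
```

A rate of 0.75 in the urban group against 0.50 elsewhere corresponds to "50% more frequently", even though the absolute gap is 25 percentage points.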

To visually depict the complex interplay between various skills and competencies and their alignment with recruitment needs, we developed a semantic network diagram (Fig.2). This diagram provides a comprehensive view of the talent landscape and identifies critical areas of skill shortages within the healthcare industry’s digital transformation journey.

Semantic network diagram.

Analysis of the semantic network diagram reveals several key insights into the talent gaps and the specific skill sets required to address them:

  (1) Skill Interdependencies: The semantic network diagram (Fig.2) highlights the interconnectedness of different skills and competencies, particularly the synergy between “data analysis” and “AI”. For example, “data analysis” and “artificial intelligence” are closely linked, indicating a need for professionals who can effectively utilize AI algorithms to analyze and interpret healthcare data. Similarly, “project management” is connected to both “digital health solutions” and “technology integration,” suggesting a demand for project managers who can lead digital transformation initiatives and manage the implementation of new technologies. The diagram shows a strong connection between “data analysis” (40% of listings) and “AI” (25% of listings), indicating a need for professionals with combined expertise in these areas.

  (2) Domain-Specific Expertise: The diagram reveals a need for domain-specific expertise in both healthcare and technology. Professionals with a strong understanding of pharmaceutical processes and regulations, combined with technical skills in areas like data analytics and AI, are in high demand. 35% of job listings required expertise in both “pharmaceutical processes” and “data analytics,” emphasizing the need for interdisciplinary training.

  (3) Emerging Technology Focus: The diagram identifies a growing demand for professionals with expertise in emerging technologies like “blockchain,” “IoT,” and “VR/AR.” This aligns with the word cloud analysis, indicating a need for talent who can develop and implement innovative solutions using these technologies within the healthcare industry. 18% of job listings mentioned “blockchain” and “IoT,” highlighting the importance of these technologies in healthcare.

  (4) Management and Leadership: The diagram highlights the importance of “project management” and “leadership” skills across various domains within healthcare. This emphasizes the need for professionals who can effectively lead teams, manage projects, and drive innovation within the digital healthcare landscape. 45% of job listings mentioned “project management” and “leadership,” indicating a critical need for these skills.

  (5) Data Security and Privacy: The diagram underscores the critical need for “data security” and “privacy” expertise. As the healthcare industry becomes increasingly digitized, protecting patient data and ensuring compliance with regulations is of paramount importance. 30% of job listings mentioned “data security” and “privacy,” highlighting the importance of these skills in the digital healthcare landscape.

  (6) Remote Healthcare and IT Support: The diagram reveals a growing demand for “remote healthcare services” and “health IT support.” This aligns with the word cloud analysis, indicating a need for professionals who can develop and maintain digital infrastructure and support remote healthcare delivery models. 22% of job listings mentioned “remote healthcare services” and “health IT support,” indicating a growing trend towards telehealth and digital infrastructure.
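The text does not state how the semantic network was constructed. One common approach, assumed here purely for illustration, is a co-occurrence network: two skills are linked when they appear in the same listing, and the edge weight is the number of listings in which they co-occur.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(listings_skills, min_weight=1):
    """Weighted edges between skills that appear in the same listing.

    Each edge key is an alphabetically ordered pair, so (a, b) and (b, a)
    are counted as the same edge.
    """
    edges = Counter()
    for skills in listings_skills:
        for a, b in combinations(sorted(set(skills)), 2):
            edges[(a, b)] += 1
    return {pair: w for pair, w in edges.items() if w >= min_weight}


def neighbours(edges, skill):
    """Skills directly linked to `skill`, with their edge weights."""
    linked = {}
    for (a, b), weight in edges.items():
        if a == skill:
            linked[b] = weight
        elif b == skill:
            linked[a] = weight
    return linked
```

Raising `min_weight` prunes rare pairings, which keeps a drawn network readable when the number of distinct skills is large.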

Discussion

The study employed natural language processing (NLP) to analyze a comprehensive dataset of 58,732 healthcare job listings, revealing a clear and evolving landscape of talent demands within the industry. This analysis provides valuable insights into the specific skill sets and competencies required to bridge the medical talent gap and support the industry’s digital transformation20,27.

The findings highlight a significant demand for technical skills such as data analysis, AI, and machine learning, which are crucial for interpreting complex healthcare data and developing intelligent solutions28. The high demand for technology integration skills suggests that future healthcare professionals must be adept at integrating emerging technologies like blockchain and IoT into clinical and administrative workflows (Smith et al., 2021). The increasing adoption of these technologies will likely drive further innovation in healthcare delivery models. For example, the integration of IoT devices with electronic health records (EHRs) can enable real-time monitoring and predictive analytics, improving patient outcomes (Wang et al., 2023). Moreover, the rise of quantum computing could revolutionize drug discovery and personalized medicine by enabling more complex simulations and data analysis29.

Additionally, the study reveals a strong emphasis on compliance and data privacy skills, reflecting the healthcare sector’s commitment to regulatory adherence and data security. As healthcare data volumes continue to grow, the need for professionals with expertise in ensuring data confidentiality, integrity, and compliance with privacy regulations becomes increasingly critical30. The increasing importance of data privacy suggests that healthcare organizations will need to invest more in cybersecurity infrastructure and training. Policymakers should consider developing regulations that mandate data privacy training for healthcare professionals to enhance overall data security31.

Beyond technical skills, the study also identifies a high demand for management and leadership expertise within the healthcare industry. Professionals with strong leadership capabilities and the ability to manage projects and teams are essential for driving innovation and leading the digital transformation efforts within healthcare organizations32. The growing demand for interdisciplinary collaboration implies that future leaders in healthcare will need to possess both technical and managerial skills. Educational programs should emphasize the development of hybrid skills to prepare professionals who can effectively bridge the gap between technology and healthcare operations33.

Furthermore, the study emphasizes the importance of innovative thinking and R&D capabilities. The healthcare industry is committed to advancing new technologies, therapies, and medical solutions to improve patient outcomes34. This requires a workforce that is equipped with the skills and mindset to drive innovation and contribute to research and development efforts. The findings suggest that healthcare organizations should increase their investment in R&D to stay competitive. Collaboration with academic institutions and industry partners can facilitate the development of novel solutions and therapies, ultimately improving patient care35.

The analysis also highlights the need for interdisciplinary communication and collaboration skills within the healthcare industry. As the industry increasingly relies on multidisciplinary teams to solve complex problems and deliver patient-centered care, effective communication and collaboration across different disciplines become crucial36. The trend towards patient-centered care and digital health solutions will likely require healthcare professionals to work closely with technology experts, data scientists, and other stakeholders. Developing training programs that emphasize teamwork and communication skills will be essential for fostering effective collaboration37.

Based on the findings of this study, several recommendations can be made to address the medical talent gap and support the healthcare industry’s digital transformation:

  1. (1)

    Invest in Education and Training: Educational institutions and healthcare organizations should collaborate to develop comprehensive training programs that focus on both technical and management skills relevant to the digital healthcare landscape.

  2. (2)

    Foster Interdisciplinary Collaboration: Healthcare organizations should promote a culture of collaboration and encourage cross-disciplinary teams to work together on digital transformation initiatives.

  3. (3)

    Develop Talent Pipelines: Healthcare organizations should actively engage with educational institutions and offer internships, scholarships, and mentorship programs to attract and develop talent in high-demand areas.

  4. (4)

    Support Continuing Education: Healthcare organizations should provide opportunities for continuing education and professional development for their employees.

By addressing the medical talent gap through targeted talent development initiatives, fostering interdisciplinary collaboration, and supporting continuing education, the healthcare industry can build a skilled and adaptable workforce that is prepared to meet the challenges and opportunities of the digital future. This study contributes to the understanding of the evolving talent demands in the healthcare sector and provides actionable recommendations for policymakers, healthcare providers, and educational institutions. Future research should explore additional data sources and qualitative methods to gain a more comprehensive understanding of healthcare talent needs globally.

Conclusions

In an era marked by rapid technological advancements and increasing digitalization, the healthcare industry stands at the forefront of a transformative journey. This study, leveraging the power of natural language processing (NLP) to analyze 58,732 healthcare job listings, has provided a granular view of the evolving talent landscape. The insights garnered from this analysis are not merely indicative of current trends but are prognostic of the future trajectory of the healthcare sector.

The study has elucidated the pronounced need for a workforce adept in technology, management, and innovation. The demand for technical skills, particularly in data analysis, AI, and machine learning, underscores a paradigm shift towards data-driven insights and intelligent healthcare solutions. This is complemented by the increasing importance of management and leadership skills, which are pivotal for steering digital transformation initiatives and fostering innovation within healthcare organizations. Furthermore, the emphasis on compliance and data privacy reflects the industry’s commitment to safeguarding patient data and adhering to stringent regulatory standards.

The implications of these findings extend beyond immediate talent acquisition needs. For educational institutions, the study underscores the imperative to integrate specialized courses in data analytics, AI, machine learning, project management, and leadership, so that the next generation of healthcare professionals is well-equipped to navigate the digital landscape. Healthcare providers, for their part, must invest in continuous training and professional development programs to upskill their workforce, thereby fostering a culture of innovation and adaptability. Policymakers play a crucial role in shaping the future of healthcare by developing policies that promote interdisciplinary collaboration, support research and development, and ensure equitable access to healthcare innovation across different regions.

This study provides an in-depth analysis of the talent needs of China’s medical industry, and its results are also highly consistent with development trends in the global medical industry. Globally, as digitalization advances, the medical industry is accelerating its shift toward intelligent and precision medicine, and demand for interdisciplinary talent in technology, management, and innovation is generally increasing. Many countries face talent shortages in the digital transformation of healthcare, especially in areas related to data-driven decision-making and technological innovation.

The findings of this study therefore have broad applicability to the global healthcare industry. Policymakers can learn from China’s experience in aligning the training of medical talent with industry needs, formulate talent development strategies suited to their national conditions, promote interdisciplinary cooperation, and increase support for the research and development of digital medical technologies. Healthcare organizations can draw on the recommendations of this study to optimize talent recruitment and development programs and to strengthen internal training and talent pools, so that they can adapt to the rapidly changing medical technology environment. Educational institutions can adjust their curricula based on the research results, offering professional courses related to digital health and training versatile talent to meet the needs of the global healthcare industry.

Looking ahead, the healthcare industry is poised to embrace a confluence of technologies, including quantum computing, advanced bioinformatics, and IoT-enabled devices. The integration of these technologies will necessitate a workforce that is not only technically proficient but also capable of interdisciplinary collaboration. The findings of this study serve as a clarion call for stakeholders to proactively align their strategies with these emerging trends. By doing so, they can ensure that the healthcare sector remains resilient, innovative, and responsive to the evolving needs of patients and communities.

While this study has focused on the healthcare industry in China, the insights are universally applicable. The global healthcare sector is experiencing similar challenges and opportunities, making the findings relevant for policymakers, healthcare providers, and educational institutions worldwide. Future research should explore additional data sources and qualitative methods to gain a more comprehensive understanding of healthcare talent needs globally. This will enable the development of tailored strategies that address regional disparities and foster inclusive growth.

In conclusion, this study has illuminated the critical nexus between talent development and digital transformation in the healthcare industry. By leveraging advanced NLP techniques, we have provided actionable insights that can guide stakeholders in addressing the medical talent gap and fostering innovation. The healthcare sector’s journey towards digitalization and intelligent healthcare systems is not just a technological evolution but a strategic imperative. By aligning talent development with industry needs, we can build a skilled and adaptable workforce that is prepared to meet the challenges and opportunities of the digital future. The findings of this study are a testament to the transformative potential of data-driven insights and a call to action for stakeholders to embrace this future with foresight and determination.

Data availability

Data is provided within the manuscript or supplementary information files.

Code availability

The custom code used for natural language processing (NLP) analysis of healthcare job listings in this study is available in the supplementary materials accompanying this manuscript. The code leverages the Hugging Face Transformers library and includes implementations of prompt-tuning techniques for fine-tuning the ChatGPT model to extract structured information from job descriptions. For inquiries regarding code implementation or technical details, please contact the research team at cian@foxmail.com. The code is released under the MIT License, allowing unrestricted use for academic and non-commercial purposes. Users are required to cite this manuscript when utilizing the code or derived methodologies.

References

  1. Stoumpos, A. I., Kitsios, F., & Talias, M. A. Digital Transformation in Healthcare: Technology Acceptance and Its Applications. Int J Environ Res Public Health., 20(4), 3407. https://doi.org/10.3390/ijerph20043407 (2023).

  2. Hu, B., Liu, Y., Zhang, X., & Dong, X. Understanding regional talent attraction and its influencing factors in China: From the perspective of spatiotemporal pattern evolution. PloS one, 15(6), e0234856 https://doi.org/10.1371/journal.pone.0234856 (2020).

  3. Shu, C., Chen, Y., Yang, H., Tao, R., Chen, X., & Yu, J. Investigation and Countermeasures Research of Hospital Information Construction of Tertiary Class-A Public Hospitals in China: Questionnaire Study. JMIR formative research, 7, e41820. https://doi.org/10.2196/41820 (2023).

  4. Sun, J., & Zhang, X. Design and implementation of training standards for outstanding talents under the background of “educational informatization”. E3S Web of Conferences, 253, Article 03082 https://doi.org/10.1051/e3sconf/202125303082 (2021).

  5. Wen, M., Liao, L., Wang, Y., & Zhou, X. Effects of Healthcare Policies and Reforms at the Primary Level in China: From the Evidence of Shenzhen Primary Care Reforms from 2018 to 2019. Int. J. Environ. Res. Public Health, 19(4), 1945. https://doi.org/10.3390/ijerph19041945 (2022).

  6. Zhang, D., Zhang, G., Jiao, Y., Wang, Y., & Wang, P. “Digital Dividend” or “Digital Divide”: What Role Does the Internet Play in the Health Inequalities among Chinese Residents?. Int. J. Environ. Res. Public Health 19(22), 15162. https://doi.org/10.3390/ijerph192215162 (2022).

  7. Gan, X., Liu, L., & Wen, T. Evaluation of Policies on the Development of Prefabricated Construction in China: An Importance-Performance Analysis. Journal of Green Building, 17(1), 149–168 https://doi.org/10.3992/jgb.17.1.147 (2022).

  8. Varelas, G., Lagios, D., Ntouroukis, S., Zervas, P., Parsons, K., & Tzimas, G. Employing Natural Language Processing Techniques for Online Job Vacancies Classification. In: Maglogiannis, I., Iliadis, L., Macintyre, J., Cortez, P. (eds) Artificial Intelligence Applications and Innovations. AIAI 2022 IFIP WG 12.5 International Workshops: MHDW 2022, 5G-PINE 2022, AIBMG 2022, ML@HC 2022, and AIBEI 2022, Hersonissos, Crete, Greece, June 17–20, 2022, Proceedings. Springer International Publishing, Cham, 333–344 https://doi.org/10.1007/978-3-031-08341-9_27 (2022).

  9. Huaping, G., Binhua, G. Digital economy and demand structure of skilled talents — analysis based on the perspective of vertical technological innovation, Telematics and Informatics Reports 7, 100010, https://doi.org/10.1016/j.teler.2022.100010 (2022).

  10. Putka, D. J., Oswald, F. L., Landers, R. N., Beatty, A. S., McCloy, R. A., & Yu, M. C. Evaluating a Natural Language Processing Approach to Estimating KSA and Interest Job Analysis Ratings. J. Bus. Psychol. 38(2), 385–410 https://doi.org/10.1007/s10869-022-09824-0 (2023).

  11. Singh, K., Prabhu, A., & Kaur, N. The Impact and Role of Artificial Intelligence (AI) in Healthcare: Systematic Review. Current topics in medicinal chemistry, https://doi.org/10.2174/0115680266339394250225112747 (2025).

  12. Mitosis KD, Lamnisos D, Talias MA. Talent Management in Healthcare: A Systematic Qualitative Review. Sustainability 13(8), 4469 https://doi.org/10.3390/su13084469 (2021).

  13. P. Singla, J. Kaur, Anju, A. Soni, A. Tuteja and S. Sharma, "Streamlining Talent Acquisition: A Machine Learning Approach to Automated Resume Screening," 2024 Second International Conference on Advanced Computing & Communication Technologies (ICACCTech), Sonipat, India, 2024, pp. 69–75, https://doi.org/10.1109/ICACCTech65084.2024.00022.

  14. da Silva R. G. L. The advancement of artificial intelligence in biomedical research and health innovation: challenges and opportunities in emerging economies. Globalization and health, 20(1), 44 https://doi.org/10.1186/s12992-024-01049-5 (2024).

  15. Bajwa, J., Munir, U., Nori, A., & Williams, B. Artificial intelligence in healthcare: transforming the practice of medicine. Future healthcare journal, 8(2), e188–e194 https://doi.org/10.7861/fhj.2021-0095 (2021).

  16. Reuter-Oppermann, M., Kühl, N. Artificial Intelligence for Healthcare Logistics: An Overview and Research Agenda. In: Masmoudi, M., Jarboui, B., Siarry, P. (eds) Artificial Intelligence and Data Mining in Healthcare. Springer, Cham. https://doi.org/10.1007/978-3-030-45240-7_1 (2021).

  17. Daria Shevtsova, Anam Ahmed, Iris W A Boot, Carmen Sanges, Michael Hudecek, John J L Jacobs, Simon Hort, Hubertus J M Vrijhoef, Trust in and Acceptance of Artificial Intelligence Applications in Medicine: Mixed Methods Study, JMIR Human Factors, Volume 11, 2024, ISSN 2292-9495, https://doi.org/10.2196/47031.

  18. Gazquez-Garcia, J., Sánchez-Bocanegra, C. L., & Sevillano, J. L. AI in the Health Sector: Systematic Review of Key Skills for Future Health Professionals. JMIR medical education, 11, e58161. https://doi.org/10.2196/58161 (2025).

  19. iResearch Institute. China Online Recruitment Market Development Study Report [Report]. iResearch. Retrieved from http://www.iresearch.com.cn (2023).

  20. Abbas Akkasi, Job description parsing with explainable transformer based ensemble models to extract the technical and non-technical skills, Natural Language Processing Journal, Volume 9, 100102, ISSN 2949-7191, https://doi.org/10.1016/j.nlp.2024.100102 (2024).

  21. OpenAI. GPT-4 technical report. Retrieved from https://doi.org/10.48550/arXiv.2303.08774 (2023).

  22. Gozzi, M., & Di Maio, F. Comparative Analysis of Prompt Strategies for Large Language Models: Single-Task vs. Multitask Prompts. Electronics, 13(23), 4712 https://doi.org/10.3390/electronics13234712 (2024).

  23. Prottasha, N.J., Mahmud, A., Sobuj, M.S.I. et al. Parameter-efficient fine-tuning of large language models using semantic knowledge tuning. Sci Rep 14, 30667 https://doi.org/10.1038/s41598-024-75599-4 (2024).

  24. Goldfarb, A., Taska, B., & Teodoridis, F. Artificial Intelligence in Health Care? Evidence from Online Job Postings. AEA Papers and Proceedings, 110, 400–404 https://doi.org/10.1257/pandp.20201006 (2020).

  25. Ibrahim Rahhal, Ismail Kassou, Mounir Ghogho, Data science for job market analysis: A survey on applications and techniques, Expert Systems with Applications, 251, 2024, 124101, ISSN 0957-4174, https://doi.org/10.1016/j.eswa.2024.124101.

  26. André Queirós, Daniel Faria, & Fernando Almeida. STRENGTHS AND LIMITATIONS OF QUALITATIVE AND QUANTITATIVE RESEARCH METHODS. Eur. J. Educ. Stud. 3(9) https://doi.org/10.5281/zenodo.887089 (2017).

  27. Tee, P. K., Wong, L. C., Dada, M., Song, B. L., & Ng, C. P. Demand for digital skills, skill gaps and graduate employability: Evidence from employers in Malaysia. F1000Research, 13, 389 https://doi.org/10.12688/f1000research.148514.1 (2024).

  28. Liudmila Alekseeva, José Azar, Mireia Giné, Sampsa Samila, Bledi Taska, The demand for AI skills in the labor market, Labour Economics, Volume 71, 2021, 102002, ISSN 0927-5371, https://doi.org/10.1016/j.labeco.2021.102002.

  29. Usharani Hareesh Govindarajan, Gagan Narang, Dhiraj Kumar Singh, Vinay Surendra Yadav, Blockchain technologies adoption in healthcare: Overcoming barriers amid the hype cycle to enhance patient care, Technological Forecasting and Social Change, Volume 213, 2025, 124031, ISSN 0040-1625, https://doi.org/10.1016/j.techfore.2025.124031.

  30. Alderwick, H., Hutchings, A., Briggs, A. et al. The impacts of collaboration between local health care and non-health care organizations and factors shaping how they work: a systematic review of reviews. BMC Public Health 21, 753 https://doi.org/10.1186/s12889-021-10630-1 (2021).

  31. Zhang, M., Wu, S., Ibrahim, M. I., Noor, S. S. M., & Mohammad, W. M. Z. W. Significance of Ongoing Training and Professional Development in Optimizing Healthcare-associated Infection Prevention and Control. J. Med. Signals Sens. 14, 13 https://doi.org/10.4103/jmss.jmss_37_23 (2024).

  32. José António Porfírio, Tiago Carrilho, José Augusto Felício, Jacinto Jardim, Leadership characteristics and digital transformation, J. Bus. Res. 124, 610–619 https://doi.org/10.1016/j.jbusres.2020.10.058 (2021).

  33. Alqahtani, N., Wafula, Z. Artificial Intelligence Integration: Pedagogical Strategies and Policies at Leading Universities. Innov High Educ 50, 665–684. https://doi.org/10.1007/s10755-024-09749-x (2025).

  34. Sinha, R. The role and impact of new technologies on healthcare systems. Discov Health Systems 3, 96 https://doi.org/10.1007/s44250-024-00163-w (2024).

  35. Milella, F., Minelli, E. A., Strozzi, F., & Croce, D. Change and Innovation in Healthcare: Findings from Literature. ClinicoEconomics and outcomes research : CEOR, 13, 395–408 https://doi.org/10.2147/CEOR.S301169 (2021).

  36. LaFrance, D. L., Weiss, M. J., Kazemi, E., Gerenser, J., & Dobres, J. Multidisciplinary Teaming: Enhancing Collaboration through Increased Understanding. Behavior analysis in practice, 12(3), 709–726 https://doi.org/10.1007/s40617-019-00331-y (2019).

  37. Jette Ammentorp, Sarah Bigi, Jonathan Silverman, Marlene Sator, Peter Gillen, Winifred Ryan, Marcy Rosenbaum, Meg Chiswell, Eva Doherty, Peter Martin, Upscaling communication skills training – lessons learned from international initiatives, Patient Education and Counseling 104(2), 352–359, https://doi.org/10.1016/j.pec.2020.08.028 (2021).

Funding

Tianjin College Students’ Innovation and Entrepreneurship Training Program (Project Number: 202410063030).

Author information

Authors and Affiliations

  1. Tianjin University of Traditional Chinese Medicine, Tianjin, China

    Yirui Chen, Xinrui Zhan, Wencan Yang, Xueying Yan, Yuxin Du & Tieniu Zhao

Contributions

Y.C.: Conceptualization, methodology, data curation, formal analysis, writing—original draft, writing—review & editing, visualization. X.Z.: Conceptualization, methodology, data curation, formal analysis, writing—original draft, writing—review & editing, visualization. W.Y.: Data curation, formal analysis, writing—review & editing, visualization. X.Y.: Data curation, formal analysis, writing—review & editing, visualization. Y.D.: Data curation, formal analysis, writing—review & editing, visualization. T.Z.: Conceptualization, methodology, data curation, formal analysis, writing—original draft, writing—review & editing, visualization, project administration, funding acquisition.

Corresponding author

Correspondence to Tieniu Zhao.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Chen, Y., Zhan, X., Yang, W. et al. A NLP analysis of digital demand for healthcare jobs in China. Sci Rep 15, 14518 (2025). https://doi.org/10.1038/s41598-025-98552-5

  • DOI: https://doi.org/10.1038/s41598-025-98552-5

Keywords

  • Medical talent gap
  • Digital skills in healthcare
  • Natural language processing
  • Job analysis in China