Data Centre Operations Lead

VDart Inc


Job Location:

Rockville, MD - USA

Monthly Salary: Not Disclosed
Posted on: 4 hours ago
Vacancies: 1 Vacancy

Job Summary

Role: Data Centre Operations Lead.

Location: Rockville MD (Onsite).

Duration: 06 Months.

Job Description:

  • Lead the data center operations team, providing guidance, training, and support to ensure high performance and operational excellence. Act as the primary point of contact for all data center-related issues and escalations.
  • Oversee the daily operations of data center facilities, ensuring high availability and reliability of all systems.
  • Manage the data center infrastructure technology stack end to end: VMware, VxRail, Citrix, LogicMonitor, Moogsoft, AD, Azure AD SSO, Azure Security Policy, PKI, Windows and Linux servers, vulnerability management, BeyondTrust Password Safe and AD Bridge, storage and backup tools, etc.
  • Ensure adherence to operational standards and best practices.
  • Drive major incidents and potential incidents end to end, with periodic updates to client stakeholders for approvals and recommendations.
  • Lead, mentor, and manage a team of data center operations engineers.
  • Provide guidance and support for professional development and performance improvement.
  • Coordinate and manage the team's daily activities, ensuring alignment with organizational goals and priorities.
  • Lead the response to data center incidents, ensuring timely resolution and minimal impact on business operations.
  • Perform root cause analysis and implement preventive measures to avoid recurrence of issues.
  • Develop and maintain incident management processes and procedures.
  • Plan and oversee scheduled maintenance and upgrades of data center infrastructure.
  • Ensure that all hardware and software components are up-to-date and functioning optimally.
  • Coordinate with vendors and service providers for maintenance and support activities.
  • Monitor and analyze data center resource usage, ensuring efficient utilization and avoiding over-provisioning.
  • Conduct capacity planning to support future growth and demand.
  • Implement optimization strategies to enhance performance and reduce operational costs.
  • Ensure data center infrastructure adheres to security policies, standards, and best practices.
  • Implement and maintain security controls to protect data and systems.
  • Ensure compliance with regulatory requirements and industry standards (e.g., ISO 27001, HIPAA).
  • Develop and implement disaster recovery and business continuity plans for data center operations.
  • Ensure regular testing and validation of disaster recovery procedures.
  • Ensure data center infrastructure is resilient and can recover quickly from failures or disruptions.
  • Work closely with other IT teams, business units, and stakeholders to understand requirements and deliver solutions that meet their needs.
  • Collaborate with vendors and service providers to evaluate and integrate new technologies and services.
  • Communicate effectively with stakeholders, providing regular updates on data center operations and performance.
  • Maintain comprehensive documentation of data center infrastructure, configurations, processes, and procedures.
  • Generate regular reports on data center performance, incidents, and operational metrics.
  • Ensure documentation is up-to-date and accessible to relevant stakeholders.

Here are some technical responsibilities in detail.

Active Directory and Cloud Services

  • Administer Azure AD; manage security groups, GPO, SSO, and application configurations.
  • Handle public cloud directory services, Oracle IDCS, network/file shares, SCP policies, privileged user management, and service account passwords.
  • Conduct AD audits, schema updates, and backup/restore services, and assist with JSOX, FDA, and GQS audits.
  • Manage ticket queues and follow up on aging tickets.
  • Provide end-to-end support for Active Directory domains (Azure AD, AD security groups, GPO, SSO, application configurations, etc.).

IT Environment Monitoring

  • 24x7 ITSM queue-based monitoring.
  • Triage and first-level troubleshooting based on alert severity.
  • Incident resolution using Standard Operating Procedures.

Vendor Coordination:

  • Coordinate with vendors for infrastructure on public/private Cloud.
  • Provide vendor contact details and escalation matrix.

Citrix Architecture and Optimization:

  • Maintain Citrix architecture and seek continuous optimization.
  • Participate in architecture design and planning with the steering committee.
  • Recommend system and end-user performance improvements.
  • Implement approved performance improvements.

Citrix Environment Support:

  • Support Citrix environment and integrate with Otsuka-specific technologies.
  • Order, install, update, and maintain Citrix servers and tools.
  • Assess, consolidate, upgrade, and manage Citrix infrastructure, including SDX appliances.
  • Manage NetScaler infrastructure and upgrades.

IT Service Continuity and Disaster Recovery (DR) Services:

  • Strategy and Policy Definition
  • Coordination and Execution
  • Data Management
  • Testing and Reporting
  • DR Activation and Coordination
  • Review and Enhancement

Onsite and Remote Support:

  • Onsite server support, IMAC services, and remote software installation.
  • Decommissioning, proactive evaluation, and data center assessment.

Windows Server Management & Projects:

  • Administer and monitor Windows servers, including health checks and problem management.
  • Manage local users, groups, shares, and server disk/storage.
  • Handle event logs, vendor coordination, and performance issues.
  • Install and manage IIS, apply security patches, and troubleshoot clusters.
  • Oversee DNS, SCOM, certificate management, migrations, and server deployments.

Linux Server Administration and Projects:

  • User Administration - Manage user accounts, environments, and home directories.
  • OS Package Administration - Add/remove OS packages and troubleshoot issues.
  • Storage Management - Create/manage file systems and logical volumes, and clean up disk space.
  • NIS and NFS Management - Administer NIS tables and services; install/configure NFS servers.
  • Network and Security - Configure/manage NTP and DNS, and implement security standards.
  • OS Upgrade and Patching - Upgrade/patch Linux OS, configure SSSD and AD, and manage disk and security.
  • High Availability and Compliance - Build/configure HA environments, enforce security, and ensure regulatory compliance.
  • Server Builds and Management - Install/configure NIS, mail, and DNS servers and centralized syslog servers.

DC Power Tools:

  • Management and support of the tool stack: LogicMonitor, Moogsoft, ManageEngine, BeyondTrust Password Safe, BeyondTrust AD Bridge, Commvault Compliance Search, Veritas, HubStor, etc.

LogicMonitor Administration:

  • Installation and Configuration - Install and configure LogicMonitor Collectors and group servers for monitoring.
  • Monitoring and Reporting - Configure monitoring settings, create HLD/Templates/SOPs, and integrate with Moogsoft.
  • Maintenance and Troubleshooting - Back up/restore LogicMonitor Collectors, troubleshoot devices, and modify LogicModules.
  • Consultancy and Coordination - Provide consultancy, manage stakeholders, oversee platform support, and monitor infrastructure services.

Moogsoft Administration and Issues:

  • Integration and Event Management - Resolve Element Layer Tool integration issues and missing events/alarms at the Moogsoft layer.
  • Ticketing and Situation Formulation - Address ticketing problems with ITSM tools and inconsistencies in situation formulation/Cookbook.
  • Maintenance and Upgrades - Fix maintenance window malfunctions and perform Moogsoft module upgrades.
  • Configuration Management - Manage Moogsoft Recipe additions/deletions/modifications and Cookbook enablement/disablement.
  • TeamRooms and API Integration - Create/modify/delete Moogsoft TeamRooms and integrate Moogsoft AI Operations with vendor APIs to automate ticketing.
  • Updates and Enhancements - Manage Moogsoft updates and enhancements.

Storage Backup & Data Management:

  • Define performance, data segregation, backup, restore, archival, retention, reliability, encryption, security, scheduling, and access control needs.
  • Recommend hierarchical storage solutions (shared/dedicated tiered storage platforms) and procedures to meet requirements and SLRs.
  • Review and approve storage and backup solutions and procedures.
  • Procure and manage data storage infrastructure (SAN, NAS, tape, optical).
  • Provide and manage backup and archival consumables for Otsuka facilities.
  • Maintain data set placement, manage data catalogs, and configure Nimble SAN and NAS switches.
  • Notify Otsuka of any data losses or risks.
  • Perform data and file backups/restores per procedures and SLRs.
  • Manage file transfers, data movement, and input processing for third-party media.
  • Decommission storage and backup environments per policies.
  • Develop and maintain backup schedules, manage backup media, and ensure data retention.
  • Work with third-party vendors to archive data at secure offsite locations.
  • Conduct media testing to ensure data recovery capability and integrity.
  • Test end-to-end system recovery, remediate flaws, and coordinate with vendors.
  • Recover files/data as required, provide recovery updates, and manage data replication to DR sites.

Minimum Qualifications / Skills:

  • Bachelor's degree in Computer Science, Information Technology, Electrical Engineering, or a related field. Advanced degrees or relevant professional training are a plus.
  • Minimum 10 years of experience in data center operations with at least 5 years in a leadership or senior technical role.
  • Extensive experience in data center operations with a proven track record of managing large-scale data center environments.
  • Strong leadership and team management skills with the ability to motivate and develop a high-performing operations team.
  • In-depth knowledge of data center infrastructure, including servers, storage, networking, power, and cooling systems.
  • Excellent problem-solving and analytical skills with the ability to diagnose and resolve complex technical issues.
  • Experience with incident and problem management, change management, and capacity planning.
  • Strong understanding of compliance, security, and regulatory requirements related to data center operations.
  • Effective communication and interpersonal skills with the ability to interact with stakeholders at all levels.
  • Experience in vendor management and contract negotiations.
  • A proactive approach to continuous improvement and innovation in data center operations.

Preferred Qualifications / Skills:

  • Relevant certifications from Microsoft, VMware, Citrix, and storage vendors are highly desirable.
  • Experience with ITIL or other IT service management frameworks.
  • Familiarity with cloud computing and hybrid data center environments.
  • Excellent communication and collaboration skills with the ability to effectively interact with technical and non-technical stakeholders at all levels of the organization.
  • Strong analytical and problem-solving skills with the ability to identify root causes of issues and implement effective solutions in a timely manner.
  • Proven ability to work independently as well as part of a team with a proactive and self-motivated attitude towards achieving project goals.

Key Skills

  • English
  • Helpdesk
  • Asset Management
  • ABB
  • Data Mining
  • Control Engineering