General information

EGI Core Services delivered on a best-effort basis

  • Broadcast circulated on July 6th
  • Gap between the end of the EGI-ACE project (June 2023) and the EOSC procurement that will fund the services (starting from Jan 2024)
  • Delivery in maintenance mode, ensuring continuous operation and security system maintenance
    • bug fixing and application of security patches
    • no implementation of new features
    • no major upgrades
    • slower response times to tickets are expected

Middleware


UMD

New infrastructure soon in production.

Operations

ARGO/SAM

  • Monitoring of xrootd endpoints
    • some endpoints are exposed outside the site in read-only mode
    • the new service type "eu.egi.readonly.xrootd" was created for this purpose (see GGUS 160848)
    • new version of the xrootd probe executing only "read" tests: to be added in UMD and deployed in ARGO (GGUS 163071)
  • New version of srm probe to be deployed (GGUS 162411) and to be included in UMD (GGUS 162424)
    • support for py3 only
    • support for SRM+HTTPS
    • updated default Top-BDII endpoint

FedCloud


Feedback from DMSU


New Known Error Database (KEDB)

The KEDB has been moved to Jira+Confluence: https://confluence.egi.eu/display/EGIKEDB/EGI+Federation+KEDB+Home

  • problems are tracked with Jira tickets to better follow their evolution
  • problems can be registered by DMSU staff and EGI Operations team


Monthly Availability/Reliability

Under-performed sites in the past A/R reports with issues not yet fixed:

Under-performed sites after 3 consecutive months, under-performed NGIs, QoS violations (Oct 2023):

sites suspended: 

Verify configuration records

On a yearly basis, the information registered in GOC-DB needs to be verified. NGIs and RCs have been asked to check it. In particular:

  1. NGI managers should review the people registered and the roles assigned to them, and in particular check the following information:
    • E-Mail
    • ROD E-Mail
    • Security E-Mail

NGI managers should also review the status of the "not certified" RCs, according to the RC Status Workflow;

  2. RC administrators should review the people registered and the roles assigned to them, and in particular check the following information:
    • E-Mail
    • telephone numbers
    • CSIRT E-Mail

RC administrators should also review the information related to the registered service endpoints.

The process should be completed by Oct 6th.

List of tickets in the GGUS search page

  • 11 out of 30 tickets still open

Documentation

IPv6 readiness plans

Campaign to upgrade HTCondor to version 10 with SSL authentication enabled

  • In Feb 2022 OSG fully moved to token-based AAI, abandoning X509 certificates
  • HTCondorCE: replacement of Grid Community Toolkit
    • The long-term support series (9.0.x) from the CHTC repositories supported X509/VOMS authentication until May 2023
    • Starting with 9.3.0 (released in October 2021), the HTCondor feature releases do NOT contain this support
    • EGI sites were recommended to stay with the long-term support series for the time being
  • Now the sites can start the upgrade
  • Tickets to sites created at the beginning of November 2023

Enabling SSL authentication on HTCondor 9 and 10

The HTCondor team set up an upgrade procedure to help sites and VOs with the migration from X509 personal certificates to tokens.

Essentially, an intermediate step was introduced in which plain SSL authentication can be used to authenticate a client's proxy, in addition to GSI or token authentication.

In summary, the steps are:

  • update to HTCondor 9.0.19
  • enable SSL authentication (with priority over GSI)
  • map the users' DNs
  • test SSL authentication successfully
  • update to HTCondor 10.6.0 or later
  • install and configure the Check-in plugin

Note that the last step uses the HTCondor feature channel, since it is the one that supports the EGI Check-in plugin (from 10.4.0).

  • In this way the sites can accept clients' proxies and tokens at the same time while waiting for the supported VOs to move completely to tokens.
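
As an illustration only, the upgrade path above can be sketched as a small Python helper that reports the next step for a given installed HTCondor version. The version thresholds come from the steps listed above; the function names and the returned messages are hypothetical, and the helper does not inspect a real installation.

```python
# Hypothetical helper: names and messages are illustrative, not part of HTCondor.
# Versions are expected in three-part form, e.g. "9.0.19".

STEP_UPDATE_LTS = "update to HTCondor 9.0.19"
STEP_ENABLE_SSL = (
    "enable and test SSL authentication, map the users' DNs, "
    "then update to 10.6.0 or later"
)
STEP_CHECKIN = "install and configure the Check-in plugin"
STEP_UNKNOWN = "version outside the path described here, check manually"


def parse_version(text: str) -> tuple:
    """Turn '9.0.19' into (9, 0, 19) so that versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def next_step(installed: str) -> str:
    """Return the next step of the procedure for the installed version."""
    version = parse_version(installed)
    if version >= (10, 6, 0):
        return STEP_CHECKIN
    # Only the 9.0.x long-term support series is part of the path.
    if version[:2] == (9, 0) and version >= (9, 0, 19):
        return STEP_ENABLE_SSL
    if version < (9, 0, 19):
        return STEP_UPDATE_LTS
    return STEP_UNKNOWN


if __name__ == "__main__":
    for v in ("8.8.10", "9.0.17", "9.0.19", "9.5.0", "10.6.0"):
        print(v, "->", next_step(v))
```

Versions such as 9.5.0 or 10.0.9 are deliberately reported as outside the path, since the procedure only names 9.0.19 and 10.6.0 or later.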

The new HTCondor version is not yet included in UMD (GGUS 162689). WLCG kindly set up a dedicated repository for HTCondor 9.0.19.

Important for the sites:

  • Please start collecting information from the VOs you support about the DNs that should be mapped on your endpoints
  • Mapping for the ops VO - at least the following certificates:
    • EGI Monitoring Service:
      • "/DC=EU/DC=EGI/C=GR/O=Robots/O=Greek Research and Technology Network/CN=Robot:argo-egi@grnet.gr"
      • "/DC=EU/DC=EGI/C=HR/O=Robots/O=SRCE/CN=Robot:argo-egi@cro-ngi.hr"
    • EGI Security monitoring:
      • "/DC=EU/DC=EGI/C=GR/O=Robots/O=Greek Research and Technology Network/CN=Robot:argo-secmon@grnet.gr"
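
A minimal sketch of how mapping entries for the DNs above could be generated. It assumes mapfile lines of the form `METHOD /regex/ account`, with the DN written as an anchored regular expression; the local account name `ops001` is hypothetical, and the exact syntax and file location should be checked against the HTCondor-CE documentation for the installed version.

```python
# DNs copied from the list above; everything else is an illustrative assumption.
OPS_DNS = [
    "/DC=EU/DC=EGI/C=GR/O=Robots/O=Greek Research and Technology Network/CN=Robot:argo-egi@grnet.gr",
    "/DC=EU/DC=EGI/C=HR/O=Robots/O=SRCE/CN=Robot:argo-egi@cro-ngi.hr",
    "/DC=EU/DC=EGI/C=GR/O=Robots/O=Greek Research and Technology Network/CN=Robot:argo-secmon@grnet.gr",
]

# Characters escaped so that the DN is matched literally inside /.../ delimiters.
REGEX_SPECIALS = set(r".^$*+?()[]{}|\/")


def escape_dn(dn: str) -> str:
    """Backslash-escape regex metacharacters and the '/' delimiter in a DN."""
    return "".join("\\" + ch if ch in REGEX_SPECIALS else ch for ch in dn)


def mapfile_line(dn: str, account: str, method: str = "SSL") -> str:
    """Build one mapping line: method, anchored DN regex, local account."""
    return f"{method} /^{escape_dn(dn)}$/ {account}"


if __name__ == "__main__":
    # "ops001" is a placeholder; use the account your site maps the ops VO to.
    for dn in OPS_DNS:
        print(mapfile_line(dn, "ops001"))
```

Anchoring the pattern with `^` and `$` keeps a DN from matching as a substring of a longer one.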

Important for the VOs:

  • update the condor-client as well in coordination with the sites

Monitoring:

Issues:

  • some issues with LHCb clients (v. 8.8.10) when SSL is used as the primary authentication method. It works fine when SEC_CLIENT_AUTHENTICATION_METHODS=GSI,SSL is set on the CE. The HTCondor developers have been contacted for advice.

DPM Decommission and migration

  • Support for DPM ended in June 2023
    • CERN IT will provide minimal support for DPM until the EOL of CentOS 7, with very little effort:
      • only critical issues will be looked into
  • DPM provides a migration script to dCache (migration guide)
  • In September 2022 tickets were opened to the sites to plan the migration and decommissioning
  • Migrations still pending
    • Australia-T2
    • BEIJING-LCG2
    • BG05-SUGrid (EOS)
    • CYFRONET-LCG2 (EOS)
    • GRIF (EOS)
    • IN2P3-IRES (DPM read-only from July, moving data to the new dCache server, see GGUS 163052)
    • INFN-COSENZA (dCache)
    • INFN-FRASCATI (dCache)
    • INFN-ROMA1 (dCache)
    • NCP-LCG2 (dCache)
    • UKI-LT2-Brunel (XrootD/CEPHFS)
    • UNIBE-LHEP (dCache)
    • By Q3 2023
      • UKI-SCOTGRID-DURHAM (XrootD/CEPHFS)
    • By Q4 2023
      • PSNC (EOS)
    • By Q1 2024
      • UKI-NORTHGRID-MAN-HEP (XrootD/CEPHFS)
    • not clear/no reply
      • ATLAND
      • GR-07-UOI-HEPLAB
  • Please note that after June 30th no support is provided for issues with the migration to dCache.

New benchmark HEPscore23

The HEPscore23 benchmark is replacing the old HEP-SPEC06.

Recent activities:

  • Some tests were performed, in particular with sites sending normalised reports.
  • APEL client 1.9.2 released, adding basic HEPscore23 publishing using the existing message format
    • It needs to be added to UMD
  • APEL server release candidate in testing
    • Liaising with Portal on setting up testing with them
    • this new version allows the aggregation of the accounting records by benchmark, to monitor the move to the new benchmark over time
    • When the tests are successful, final release of APEL server update and of the Portal
  • Information for testing the publication of accounting records with the new benchmark:
  • A fix is expected in ARC-CE for the proper configuration of HEPscore23
  • Please contact us if you'd like to make tests with the new benchmark
  • Investigating an issue between the Accounting Portal, the Accounting Repository, and AMS with retrieving the accounting records

HEPSCORE application:

April GDB:

June WLCG Operations Coordination meeting:

Monitoring of webdav and xrootd protocols/endpoints

AOB


Next meeting

December
